Molecular predictors of fungal infection

ABSTRACT

Methods for identifying fungal infection, assays for identifying genomic and protein markers of fungal infection, and methods for diagnosing the fungal infection. In one aspect, the method of identifying fungal infection by proteomic assay involves measuring the protein levels of proteins listed in Table 2A in a peripheral blood cell sample and comparing the determined protein levels to standard protein levels. In another aspect, subjects identified as being infected with fungal infection are treated with anti-fungals.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority to U.S. Provisional Patent Application Ser. No. 61/181,216 filed May 26, 2009, which is incorporated herein by reference in its entirety.

BACKGROUND

The invention relates to methods of identifying infectious disease infection prior to presentation of symptoms, assays for identifying genomic markers of infectious disease, and methods for diagnosing the underlying etiology of infectious disease. In some cases, the infectious disease results from bacterial infection. In other cases the infectious disease results from viral infection. In other cases the infectious disease results from fungal infection. The invention also relates to methods for identifying the nature of an infectious disease at the point-of-care.

Candida bloodstream infections (BSIs) cause significant morbidity and mortality amongst hospitalized patients. The gold standard diagnostic (blood culture) lacks sensitivity and can be slow to produce a diagnosis. Delay in therapy increases mortality roughly threefold. Therefore, there is an urgent need for new tools for early identification of Candida BSI. Studies of unique host response to pathogen classes may provide novel means of stratifying sepsis patients by the etiology of underlying infection.

Whole blood gene expression patterns, captured by microarrays, offer a robust means of classifying infectious pathogens, and could provide a means of early and specific diagnosis well in advance of standard methods of detection of BSI. Rapid and accurate pathogen classification in this setting would allow for increased antimicrobial precision and perhaps identify specific pathways involved in host response to various infectious pathogens.

Peripheral blood leukocytes represent a reservoir and migration point for cells representing all aspects of the host immune response. Gene expression patterns obtained from peripheral blood cells can discriminate between various physiologic states as well as exposures to pathogens, immune modifiers (e.g., LPS), and environmental exposures. While current infectious disease diagnostics rely heavily on pathogen-based detection, the development of reproducible means for extracting whole blood RNA, coupled with advanced statistical methods for analysis of complex datasets, now allows the possibility of classifying infections based on host gene expression profiling that reveals pathogen-specific signatures of disease.

SUMMARY OF THE INVENTION

In one aspect, a method of identifying a subject having candidiasis comprising, a) determining protein levels of at least three proteins in a peripheral blood cell sample of the subject, wherein the proteins are selected from Table 2A; and b) comparing the protein expression levels of the proteins to standard protein levels for the proteins, wherein a difference between the determined protein levels and standard protein levels is indicative of a subject having candidiasis.

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining protein levels of at least three proteins in a peripheral blood cell sample of the subject, wherein the proteins are selected from Table 2A, and b) communicating the protein levels to a medical practitioner for the purpose of identifying a subject having candidiasis.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) determining protein levels of at least three proteins in a peripheral blood cell sample of the subject, wherein the proteins are selected from Table 2A, c) comparing the protein expression levels of the proteins to standard protein levels for the proteins, wherein a difference between the levels of expression of the proteins and the standard protein levels is indicative of a subject having candidiasis, and d) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) providing the peripheral blood cell sample to a laboratory for the purpose of determining protein levels of at least three proteins in the peripheral blood cell sample, wherein the proteins are selected from Table 2A, c) receiving data from the laboratory, the data indicating the protein levels of the at least three proteins selected from Table 2A, wherein a difference between the determined levels of expression of the proteins and the standard protein levels is indicative of a subject having candidiasis, d) determining whether the subject has candidiasis, and e) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of determining protein expression levels of at least three proteins in a peripheral blood cell sample, wherein the proteins are selected from Table 2A, comprising assaying the sample with a proteomic assay.

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2A, and b) comparing the gene expression levels of the genes to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis.

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2A, and b) communicating the gene expression levels to a medical practitioner for the purpose of identifying a subject having candidiasis.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2A, c) comparing the gene expression levels of the genes to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis, and d) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) providing the peripheral blood cell sample to a laboratory for the purpose of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from Table 2A, c) receiving data from the laboratory, the data indicating the gene expression levels of the at least three genes selected from Table 2A, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis, d) determining whether the subject has candidiasis, and e) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from Table 2A in a genetic sample, comprising assaying the genetic sample with an array comprising a plurality of nucleic acid oligomers.

In another aspect, a method of distinguishing between candidiasis and bacteremia in a subject comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from either Table 2B alone or Table 2C alone, and b) comparing the gene expression levels of the genes to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject with candidiasis or bacteremia.

In another aspect, a method of distinguishing between candidiasis and bacteremia in a subject comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from either Table 2B alone or Table 2C alone, and b) communicating the gene expression levels to a medical practitioner for the purpose of distinguishing between candidiasis and bacteremia in a subject.

In another aspect, a method of treating a subject having an infection suspected to be either candidiasis or bacteremia comprising, a) obtaining a peripheral blood cell sample from the subject, b) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from either Table 2B alone or Table 2C alone, c) comparing the gene expression levels of the genes to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis or bacteremia, d) determining whether the subject has candidiasis or bacteremia, and e) administering an effective amount of a therapeutic agent to the subject.

In another aspect, a method of treating a subject having an infection suspected of being candidiasis or bacteremia comprising, a) obtaining a peripheral blood cell sample from the subject, b) providing the peripheral blood cell sample to a laboratory for the purpose of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from either Table 2B alone or Table 2C alone, c) receiving data from the laboratory, the data indicating the gene expression levels of the at least three genes selected from either Table 2B alone or Table 2C alone, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis or bacteremia, d) determining whether the subject has candidiasis or bacteremia, and e) administering an effective amount of a therapeutic agent to the subject.

In another aspect, a method of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from either Table 2B alone or Table 2C alone in a genetic sample, comprising assaying the genetic sample with an array comprising a plurality of nucleic acid oligomers.

In another aspect, a method of determining whether a subject has early, mid- or late stage candidiasis comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2D for early stage candidiasis, Table 2E for mid-stage candidiasis, and Table 2F for late stage candidiasis, and b) comparing the gene expression levels of the genes to early stage standard gene expression levels, mid-stage standard gene expression levels, and late stage standard gene expression levels for the genes, wherein gene expression levels having 80% or greater similarity to the early stage standard gene expression levels and less than 80% similarity to the mid- and late stage standard gene expression levels are indicative of a subject having early stage candidiasis, wherein gene expression levels having 80% or greater similarity to the mid-stage standard gene expression levels and less than 80% similarity to the early and late stage standard gene expression levels are indicative of a subject having mid-stage candidiasis, and wherein gene expression levels having 80% or greater similarity to the late stage standard gene expression levels and less than 80% similarity to the early and mid-stage standard gene expression levels are indicative of a subject having late stage candidiasis.

In another aspect, a method of determining whether a subject has early, mid- or late stage candidiasis comprising, a) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2D for early stage candidiasis, Table 2E for mid-stage candidiasis, and Table 2F for late stage candidiasis, and b) communicating the gene expression levels to a medical practitioner for the purpose of identifying a subject having early, mid- or late stage candidiasis.

In another aspect, a method of treating a subject with candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are selected from Table 2D for early stage candidiasis, Table 2E for mid-stage candidiasis, and Table 2F for late stage candidiasis, c) comparing the gene expression levels of the genes to early stage standard gene expression levels, mid-stage standard gene expression levels, and late stage standard gene expression levels for the genes, wherein gene expression levels having 80% or greater similarity to the early stage standard gene expression levels and less than 80% similarity to the mid- and late stage standard gene expression levels are indicative of a subject having early stage candidiasis, wherein gene expression levels having 80% or greater similarity to the mid-stage standard gene expression levels and less than 80% similarity to the early and late stage standard gene expression levels are indicative of a subject having mid-stage candidiasis, and wherein gene expression levels having 80% or greater similarity to the late stage standard gene expression levels and less than 80% similarity to the early and mid-stage standard gene expression levels are indicative of a subject having late stage candidiasis, and d) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of treating a subject with candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) providing the peripheral blood cell sample to a laboratory for the purpose of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from Table 2D for early stage candidiasis, Table 2E for mid-stage candidiasis, and Table 2F for late stage candidiasis, c) receiving data from the laboratory, the data indicating the gene expression levels of the at least three genes selected from Table 2D for early stage candidiasis, Table 2E for mid-stage candidiasis, and Table 2F for late stage candidiasis, wherein gene expression levels having 80% or greater similarity to the early stage standard gene expression levels and less than 80% similarity to the mid- and late stage standard gene expression levels are indicative of a subject having early stage candidiasis, wherein gene expression levels having 80% or greater similarity to the mid-stage standard gene expression levels and less than 80% similarity to the early and late stage standard gene expression levels are indicative of a subject having mid-stage candidiasis, and wherein gene expression levels having 80% or greater similarity to the late stage standard gene expression levels and less than 80% similarity to the early and mid-stage standard gene expression levels are indicative of a subject having late stage candidiasis, d) determining whether the subject has early, mid- or late stage candidiasis, and e) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of determining gene expression levels of at least three genes in a peripheral blood cell sample, wherein the genes are selected from Table 2D, Table 2E or Table 2F in a genetic sample, comprising assaying the genetic sample with an array comprising a plurality of nucleic acid oligomers.
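The 80% similarity rule in the staging aspects above can be illustrated with a minimal Python sketch. The specification does not define how similarity is computed, so the metric here (the fraction of a stage's genes whose measured level falls within a relative tolerance of the standard level) is an assumption made only for illustration, as are the names similarity, call_stage, rel_tol and cutoff. The sketch reads the rule as assigning a stage only when the sample reaches the cutoff for that stage's standard and falls below it for the other two.

```python
def similarity(levels, standard, rel_tol=0.25):
    """Fraction of the standard's genes whose measured level lies within
    rel_tol (relative) of the standard level. Hypothetical metric."""
    shared = [g for g in standard if g in levels]
    if not shared:
        return 0.0
    close = sum(
        1 for g in shared
        if abs(levels[g] - standard[g]) <= rel_tol * abs(standard[g])
    )
    return close / len(shared)


def call_stage(levels, stage_standards, cutoff=0.80):
    """Return (stage, similarities). A stage is called only when exactly one
    stage standard reaches the cutoff; otherwise the stage is None."""
    sims = {stage: similarity(levels, std)
            for stage, std in stage_standards.items()}
    passing = [stage for stage, s in sims.items() if s >= cutoff]
    stage = passing[0] if len(passing) == 1 else None
    return stage, sims
```

Here stage_standards would map "early", "mid" and "late" to standard levels for genes drawn from Tables 2D, 2E and 2F respectively; a sample that matches no standard, or more than one, is left uncalled rather than forced into a stage.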

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining gene expression levels of genes in a peripheral blood cell sample of the subject, wherein the genes are involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, and b) comparing the gene expression levels of the genes involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis.

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining gene expression levels of genes in a peripheral blood cell sample of the subject, wherein the genes are involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, and b) communicating the gene expression levels to a medical practitioner for the purpose of identifying a subject having candidiasis.

In another aspect, a method of identifying a subject having candidiasis comprising, a) determining in a peripheral blood cell sample of the subject the activities of the IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling pathways, and b) comparing the activities of the IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling pathways to standard activities for the pathways, wherein a difference between the determined activities and the standard activities is indicative of a subject having candidiasis.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) determining gene expression levels of at least three genes in a peripheral blood cell sample of the subject, wherein the genes are involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, c) comparing the gene expression levels of the genes to standard gene expression levels for the genes, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis, and d) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of treating a subject suspected of having candidiasis comprising, a) obtaining a peripheral blood cell sample from the subject, b) providing the peripheral blood cell sample to a laboratory for the purpose of determining gene expression levels of genes in a peripheral blood cell sample of the subject, wherein the genes are involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, c) receiving data from the laboratory, the data indicating the gene expression levels of genes involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, wherein a difference between the determined levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis, and d) administering an effective amount of an anti-fungal agent to the subject.

In another aspect, a method of determining gene expression levels of genes in a genetic sample, wherein the genes are involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling, comprising assaying the genetic sample with an array comprising a plurality of nucleic acid oligomers.

In another aspect, a method of screening a compound for efficacy against candidiasis, comprising, a) determining first gene expression levels of at least three genes from cell culture, wherein the genes are selected from Table 2A, b) inoculating the cell culture with Candida, c) determining second gene expression levels of the genes from the cell culture at a time later than step b) to determine the infection status of the cell culture, wherein an increase between the first and second gene expression levels is indicative of a cell culture having candidiasis, d) contacting the infected cell culture with a compound, and e) determining third gene expression levels of the genes from the cell culture at a time later than step d) to determine the infection status of the cell culture, wherein a decrease between the second and third gene expression levels is indicative of a compound having efficacy against candidiasis.

In another aspect, a method of screening a compound for therapeutic efficacy against candidiasis, comprising, a) determining first gene expression levels of at least three genes from a first peripheral blood cell sample of a test subject, wherein the genes are selected from Table 2A, b) inoculating the test subject with Candida, c) determining second gene expression levels of the genes from a second peripheral blood cell sample of the test subject at a time later than step b) to determine the infection status of the test subject, wherein an increase between the first and second gene expression levels is indicative of a test subject having candidiasis, d) administering the compound to the test subject having candidiasis, and e) determining third gene expression levels of the genes from a third peripheral blood cell sample of the test subject at a time later than step d) to determine the infection status of the test subject, wherein a decrease between the second and third gene expression levels is indicative of a compound having therapeutic efficacy against candidiasis.

In another aspect, a method of determining the pharmacokinetic activity of a compound against candidiasis, comprising, a) determining first gene expression levels of at least three genes from a first peripheral blood cell sample of a test subject, wherein the genes are selected from Table 2A, b) inoculating the test subject with Candida, c) determining second gene expression levels of the genes from a second peripheral blood cell sample of the test subject at a time later than step b) to determine the infection status of the test subject, wherein an increase between the first and second gene expression levels is indicative of a test subject having candidiasis, d) administering the compound to the test subject having candidiasis, and e) determining third gene expression levels of the genes from a third peripheral blood cell sample of the test subject at a time later than step d) to determine the infection status of the test subject, wherein a decrease between the second and third gene expression levels is indicative of a compound having therapeutic efficacy against candidiasis.

In another aspect, a method for determining an effective dose of a compound to effectively reduce a Candida infection in a subject, comprising, a) determining first gene expression levels of at least three genes from a first peripheral blood cell sample of a test subject, wherein the genes are selected from Table 2A, b) inoculating the test subject with Candida, c) determining second gene expression levels of the genes from a second peripheral blood cell sample of the test subject at a time later than step b) to determine the infection status of the test subject, wherein an increase between the first and second gene expression levels is indicative of a test subject having candidiasis, d) administering a dose of the compound to the test subject having candidiasis, e) determining third gene expression levels of the genes from a third peripheral blood cell sample of the test subject at a time later than step d), and f) comparing the third gene expression levels to the first gene expression levels to determine if the dose effectively reduced the Candida infection in the subject.
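The three-measurement logic shared by the screening and dosing aspects above (a first baseline measurement, a second measurement after inoculation, a third after the compound is given) can be sketched in Python as follows. The specification does not say how changes across several genes are combined, so averaging the per-gene change is an assumption, and the names mean_change, screen_compound and min_change are hypothetical.

```python
from statistics import mean


def mean_change(before, after):
    """Average per-gene change in level between two time points."""
    genes = sorted(before.keys() & after.keys())
    if not genes:
        raise ValueError("no shared genes between time points")
    return mean(after[g] - before[g] for g in genes)


def screen_compound(first, second, third, min_change=0.0):
    """Summarize a screen from first (baseline), second (post-inoculation)
    and third (post-compound) gene expression levels."""
    rise = mean_change(first, second)   # increase suggests infection
    fall = mean_change(second, third)   # decrease suggests efficacy
    infected = rise > min_change
    return {
        "infected": infected,
        "efficacious": infected and fall < -min_change,
        # Third versus first levels, as in the dose-finding aspect.
        "residual_vs_baseline": mean_change(first, third),
    }
```

A residual_vs_baseline value near zero would correspond to third levels that have returned close to the first levels; what counts as "near" is left to the user of the sketch.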

In another aspect, a computer readable medium comprising standard gene expression levels of at least three genes encoding a protein selected from Table 2A and responsivity information indicating changes in the gene expression levels of the at least three genes when a subject has candidiasis.

In another aspect, a computer readable medium comprising standard gene expression levels of at least three genes encoding a protein selected from Table 2B or Table 2C and responsivity information indicating changes in the gene expression levels of the at least three genes when a subject has candidiasis or bacteremia.

In another aspect, a computer readable medium comprising standard gene expression levels of at least three genes encoding a protein selected from Table 2D, Table 2E or Table 2F and responsivity information indicating changes in the gene expression levels of the at least three genes when a subject has early, mid- or late stage candidiasis.

In another aspect, a computer readable medium comprising standard gene expression levels of genes involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling and responsivity information indicating changes in the gene expression levels of genes involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling when a subject has candidiasis.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows survival curves for dose-finding studies for candidemia (FIG. 1A) and for Staphylococcus aureus bacteremia (FIG. 1B).

FIG. 2 illustrates the experimental design used to determine candidemia specific signatures.

FIG. 3 shows a kidney fungal burden at 24, 48, 72, and 96 hours after injection (FIG. 3A) and a kidney bacterial burden (S. aureus) at 24, 48, and 72 hours after infection (FIG. 3B).

FIG. 4 illustrates a workflow for statistical analysis.

FIG. 5 shows a factor plot showing distinction between control mice and mice with candidemia for training and validation cohorts.

FIG. 6 shows algorithm-generated predictive models of infection with C. albicans using 100 repetitions of 10-fold cross-validation using the entire probe set.

FIG. 7 shows an unsupervised analysis of the candidemic mice and controls alone and the entire probe set.

FIG. 8 shows a heat map of the 82 genes contained in the “candidemia versus control” (CavC) factor (group of coexpressed genes) that discriminates between control mice and mice with candidemia for training and validation cohorts (FIG. 8A), and the factor plot (FIG. 8B).

FIG. 9 shows an algorithm generated predictive model of infection with C. albicans using 100 repetitions of 10-fold cross-validation with probability of a sample representing a mouse with candidemia shown on the y axis.

FIG. 10 shows gene expression signatures using an unsupervised analysis of the candidemic mice and controls alone with the 24-hour factor (FIG. 10A), the midcourse infection (48 hours) factor (FIG. 10B), and the late infection factor (FIG. 10C), or a nonlinear model in a supervised fashion to predict the capacity of the “days after injection” factors (FIG. 10D).

FIG. 11 illustrates a proposed schema for using a peripheral blood gene expression signature for candidemia diagnosis.

FIG. 12 shows sparse factor regression to discover time varying expression patterns between candidemic and control mice.

DETAILED DESCRIPTION

The present invention relates to methods of identifying infectious disease that results from fungal infection. The invention also relates to methods for identifying the nature of the infectious disease at the point-of-care. By point-of-care it is meant at or near to the site of patient care, such as in a health provider office or clinic or in a hospital.

Candida bloodstream infections (BSIs) cause significant morbidity and mortality amongst hospitalized patients. The gold standard diagnostic (blood culture) lacks sensitivity and can be slow to produce a diagnosis. Delay in therapy increases mortality roughly threefold. Therefore, there is an urgent need for new tools for early identification of Candida BSI. Studies of unique host response to pathogen classes may provide novel means of stratifying sepsis patients by the etiology of underlying infection.

Early interventions have been shown to improve outcome of candidemia and invasive candidiasis (IC). The clinical presentation of candidemia does not vary considerably from other nosocomial bloodstream infections. Physician judgment on the risk of candidemia, potentially aided by clinical prediction rules, is the primary driver for empiric antifungal therapy. Therefore, the differentiation between candidemia and bacteremia is often delayed by 48 to 72 hours until culture results are available. This strategy is also hampered by lack of sensitivity of the gold standard diagnostic test, the blood culture (sensitivity, about 50%). Because clinical prediction rules allow for identification of only a minority of individuals with IC, there are no optimal means for triggering preemptive therapeutic decisions. Although molecular testing such as measurement of β-D-glucan, a fungal cell wall by-product, is used as a marker for invasive fungal disease, the sensitivity of a single test is only 0.64 (95% confidence interval, 0.55-0.72). Thus, there is a need for novel early tests to diagnose candidemia and IC.

Whole blood gene expression patterns, captured by microarrays, offer a robust means of classifying infectious pathogens, and provide a means of early and specific diagnosis well in advance of standard methods of detection of BSI. Rapid and accurate pathogen classification in this setting would allow for increased antimicrobial precision and perhaps identify specific pathways involved in host response to various infectious pathogens.

Peripheral blood leukocytes represent a reservoir and migration point for cells representing all aspects of the host immune response. Gene expression patterns obtained from peripheral blood cells can discriminate between various physiologic states as well as exposures to pathogens, immune modifiers (e.g., LPS), and environmental exposures. While current infectious disease diagnostics rely heavily on pathogen-based detection, the development of reproducible techniques for extracting whole blood RNA, coupled with advanced statistical methods for analysis of complex datasets, now allows the possibility of classifying infections based on host gene expression profiling that reveals pathogen-specific signatures of disease.

The present invention provides a method of screening and treating subjects with fungal infection, including but not limited to candidiasis. The invention can be used in a hospital, physician's office or clinic by persons who want to know if a subject has a fungal infection. The method of this invention is likely to be practiced in humans but is envisioned to be used in other mammals or organisms that are susceptible to fungal infection. The present invention provides a method for early detection of candidiasis in subjects. The present invention also provides a method of determining whether a subject has candidiasis or bacteremia. In addition, the present invention provides a method of determining whether a subject has early, mid- or late stage candidiasis.

Fully realizing the potential of genome-scale information requires a paradigm shift in the way complex, large-scale data are viewed, analyzed and utilized. The biology of infection, the host response and the ensuing disease process are hugely complex. Previous work in defining the complexity of the cancer phenotype using gene expression analysis has defined approaches involving successive sub-categorization of patients according to combinations of both clinical and genomic risk factors, highlighting the predictive value of multiple genomic patterns (Acharya, C. R., Hsu, D. S., Anders, C. K., Anguiano, A., Salter, K. H., Walters, K. S., Redman, R. C., Tuchman, S. A., Moylan, C. A., Mukherjee, S., et al. Gene expression signatures, clinicopathological features, and individualized therapy in breast cancer. JAMA 299, 1574-1587 (2008); Garman, K. S., Acharya, C. R., Edelman, E., Grade, M., Gaedcke, J., Sud, S., Barry, W., Diehl, A. M., Provenzale, D., Ginsburg, G. S., et al. A genomic approach to colon cancer risk stratification yields biologic insights into therapeutic opportunities. Proc Natl Acad Sci USA 105, 19432-19437 (2008); Xu, M., Kao, M. C., Nunez-Iglesias, J., Nevins, J. R., West, M., and Zhou, X. J. An integrative approach to characterize disease-specific pathways and their coordination: a case study in cancer. BMC Genomics 9 Suppl 1, S12 (2008), all of which are incorporated herein by reference in their entirety.) The role of formal statistical models to incorporate, evaluate and weigh multiple gene expression patterns is key; it has been shown that specific classes of statistical tree models are capable of such synthesis and can improve prediction and classification for individual patients. One core methodology that underlies comprehensive clinical-molecular models uses statistical prediction tree models, and the gene expression data enters into these models as gene expression signatures (estimated “factors”) that are candidate predictive factors in statistical tree models.

Alterations in gene, protein and metabolite expression in blood in response to pathogen exposure are the basis for screening tests of the invention. Human models of pathogen exposure exist and murine models of bacterial and viral respiratory infections are well established and are an ideal means for defining gene expression patterns that are host “signatures” of the infectious prodrome. Murine and human data from peripheral blood may be used as a diagnostic window into host response to infectious challenges. Using this data, combined with literature markers, a Bayesian predictive model can be established. Bayesian modeling techniques are described in U.S. patent application Ser. Nos. 10/291,878, 10/692,002 and 12/406,751 and PCT Patent Application No. PCT/US03/33656 and PCT/US03/33946, which are incorporated by reference herein in their entirety.

The terms “anti-fungal” and “anti-fungal agent” refer to any compound, substance or agent used in the treatment of a fungal condition, disease, infection or colonization. The terms include fungicidal as well as fungistatic compounds which act on fungi in vitro, as well as in vivo. Examples of anti-fungal agents include amphotericin B, nystatin, fluconazole, itraconazole, naftifine, ketoconazole, 5-fluorocytosine and griseofulvin. The anti-fungals of the present invention are not limited to any particular mechanism of action. Nor is an understanding of the mechanism of action necessary to use the anti-fungals of the present invention.

The term “indicative” when used with gene expression levels means that the gene expression levels are up-regulated or down-regulated, altered, or changed compared to the standard gene expression levels. The term “indicative” when used with protein levels means that the protein levels are higher or lower, increased or decreased, altered, or changed compared to the standard protein levels.

The term “standard gene expression levels” refers to the gene expression levels in a subject or member of a population that does not have candidiasis, for example, a subject or member that is not infected with Candida. The term “standard protein levels” refers to the protein levels in a subject or member of a population that does not have candidiasis, for example, a subject or member that is not infected with Candida. The factors for determining a population include race, gender, age, geographic location and ethnic origin. In one embodiment, the standard gene expression levels for the genes are the average expression levels of the genes for a non-infected population to which the subject belongs, e.g., adult American female or male, or for a particular subject prior to being infected. A difference between the levels of expression of the genes and the standard gene expression levels is indicative of a subject having candidiasis. For example, a peripheral blood sample may be obtained from a subject at a medical laboratory, the blood sample worked up and screened for gene expression, the results of the screening compared to the gene expression standards, and the subject informed of the subject's infection status.
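The comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the gene names (`gene_1` to `gene_3`), the expression values, and the two-fold cutoff `min_fold_change` are assumptions introduced for the example, since the specification does not fix a numerical threshold for what counts as a difference.

```python
from statistics import mean


def standard_levels(population):
    """Average each gene's expression across non-infected population members."""
    genes = population[0].keys()
    return {g: mean(member[g] for member in population) for g in genes}


def indicative_genes(subject, standard, min_fold_change=2.0):
    """Flag genes whose level differs from the standard.

    min_fold_change is a hypothetical cutoff chosen for illustration;
    the specification does not prescribe one.
    """
    flagged = {}
    for gene, reference in standard.items():
        ratio = subject[gene] / reference
        if ratio >= min_fold_change:
            flagged[gene] = "up-regulated"
        elif ratio <= 1.0 / min_fold_change:
            flagged[gene] = "down-regulated"
    return flagged


# Illustrative values only; not data from this specification.
non_infected = [
    {"gene_1": 100.0, "gene_2": 50.0, "gene_3": 80.0},
    {"gene_1": 120.0, "gene_2": 70.0, "gene_3": 80.0},
]
standard = standard_levels(non_infected)
subject = {"gene_1": 330.0, "gene_2": 20.0, "gene_3": 88.0}
flagged = indicative_genes(subject, standard)
print(flagged)
```

In practice the standard would be computed for the population to which the subject belongs, or from the same subject's pre-infection sample, as the definition above allows.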

The term “bacteremia standard gene expression levels” refers to the gene expression levels in a subject or member of a population that has bacteremia, for example, a subject or member of a population that is infected with Staphylococcus aureus. The standard gene expression levels for the genes are the average expression levels of the genes for a non-infected population to which the subject belongs, e.g., adult American female or male, or for a particular subject prior to being infected. In one embodiment, a difference between the levels of expression of the genes and the bacteremia standard gene expression levels is indicative of a subject having candidiasis. In another embodiment, a difference between the levels of expression of the genes and the bacteremia standard gene expression levels is indicative of a subject having bacteremia.

The terms “early stage,” “mid-stage,” and “late stage” refer to infection severity. For example, 1 day after the subject is infected with Candida represents a subject with “early stage candidiasis,” 2 days after the subject is infected with Candida represents a subject with “mid-stage candidiasis,” and 3 and 4 days after the subject is infected with Candida represents a subject with “late stage candidiasis.”

The terms “early stage standard gene expression levels,” “mid-stage standard gene expression levels,” and “late stage standard gene expression levels” refer to the gene expression levels in a subject with early stage, a subject with mid-stage, and a subject with late stage candidiasis, respectively. In one embodiment, gene expression levels having 80% or greater similarity to the early stage standard gene expression levels and less than 80% similarity to the mid- and late stage standard gene expression levels are indicative of a subject having early stage candidiasis. In another embodiment, gene expression levels having 80% or greater similarity to the mid-stage standard gene expression levels and less than 80% similarity to the early and late stage standard gene expression levels are indicative of a subject having mid-stage candidiasis. In another embodiment, gene expression levels having 80% or greater similarity to the late stage standard gene expression levels and less than 80% similarity to the early and mid-stage standard gene expression levels are indicative of a subject having late stage candidiasis.
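The staging rule can be illustrated with a short Python sketch. The specification does not say how similarity is measured, so the example assumes cosine similarity between expression profiles expressed as log fold changes relative to the non-infected standard; the function names and all numbers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length profiles (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_stage(profile, stage_standards, threshold=0.80):
    """Call a stage only if it alone reaches the similarity threshold."""
    similarities = {
        stage: cosine_similarity(profile, reference)
        for stage, reference in stage_standards.items()
    }
    matches = [s for s, value in similarities.items() if value >= threshold]
    stage = matches[0] if len(matches) == 1 else None
    return stage, similarities


# Illustrative log fold-change profiles over four genes; not measured data.
stage_standards = {
    "early stage": [1.0, 0.0, 0.0, -1.0],
    "mid-stage": [0.0, 1.0, 1.0, 0.0],
    "late stage": [-1.0, 0.0, 1.0, 1.0],
}
stage, similarities = classify_stage([0.9, 0.1, 0.0, -1.1], stage_standards)
print(stage)
```

A profile that reaches the threshold for no stage, or for more than one, is left unclassified (`None`) in this sketch; how such cases should be handled is not addressed by the definition above.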

The term “Candida” refers to any Candida species that can cause infection or disease when introduced into a subject. In some embodiments, the Candida species is Candida albicans.

The term “candidiasis” refers to any infection caused by Candida. Examples of candidiasis include but are not limited to oral candidiasis (thrush), candidemia, invasive candidiasis, candidal vulvovaginitis, candidal intertrigo, diaper candidiasis, congenital cutaneous candidiasis, perianal candidiasis, candidal paronychia, erosio interdigitalis blastomycetica, chronic mucocutaneous candidiasis, systemic candidiasis, candidid, and antibiotic candidiasis (iatrogenic candidiasis).

The term “subject” refers to any animal being examined, studied or treated. It is not intended that the present invention be limited to any particular type of subject. In some embodiments of the present invention, humans are the preferred subject, while in other embodiments nonhuman animals are the preferred subject, including but not limited to mice, monkeys, ferrets, cattle, sheep, goats, pigs, chicken, turkeys, dogs, cats, horses and reptiles.

The terms “array,” “microarray” and “micro array” are interchangeable and refer to an arrangement of a collection of nucleotide sequences in a centralized location. Arrays can be on a solid substrate, such as a glass slide, or on a semi-solid substrate, such as nitrocellulose membrane. The nucleotide sequences can be DNA, RNA, or any permutations thereof. The nucleotide sequences can also be partial sequences from a gene, primers, whole gene sequences, non-coding sequences, coding sequences, published sequences, known sequences, or novel sequences.

The term “effective amount” refers to an amount of a therapeutic agent that is sufficient to exert a physiological effect in the subject.

The term “computer readable medium” refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor. Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks. In various embodiments, aspects of the present invention including data structures and methods may be stored on a computer readable medium.

The term “responsivity” refers to a change in gene expression levels of genes in a subject in response to the subject being infected with candidiasis compared to the gene expression levels of the genes in a subject that is not infected with candidiasis or a control subject.

The term “peripheral blood sample” refers to a sample of blood circulating in the circulatory system or body, taken from that system or body.

The term “genetic material” refers to a material used to store genetic information in the nuclei or mitochondria of an organism's cells. Examples of genetic material include, but are not limited to double-stranded and single-stranded DNA, RNA, and mRNA.

The term “plurality of nucleic acid oligomers” refers to two or more nucleic acid oligomers, which can be DNA or RNA.

As used herein, the terms “treat”, “treatment” and “treating” refer to the reduction or amelioration of the severity, duration and/or progression of a disease or disorder or one or more symptoms thereof resulting from the administration of one or more therapies. Such terms refer to a reduction in the replication of a fungus or bacteria, or a reduction in the spread of a fungus or bacteria to other organs or tissues in a subject or to other subjects.

The term “therapeutic agent” refers to a substance capable of producing a curative effect in a disease state. For example, a therapeutic agent for treating a subject having bacteremia is an antibiotic, examples of which include but are not limited to penicillins, cephalosporins, fluoroquinolones, tetracyclines, macrolides, and aminoglycosides. A therapeutic agent for treating a subject having candidiasis is an anti-fungal.

The methods and assays of the invention may be based upon three different molecular platforms: RNA expression, metabolomic expression, and proteomic expression.

RNA from whole blood from humans and mice can be collected using PAXgene™ RNA tubes (PreAnalytiX, Valencia, Calif.). The RNA can be extracted using a standard Versagene™ (Gentra Systems, Inc, Minneapolis, Minn.) RNA extraction protocol. The Versagene™ kit produces greater yields of higher quality RNA from the PAXgene™ RNA tubes. Following RNA extraction, GLOBINclear™ (Ambion, Austin, Tex.) for whole blood globin reduction can be used. (This method uses a bead-oligonucleotide construct to bind globin mRNA.) Quality of the RNA can be assessed by several techniques. First, RNA quality can be assessed using the Agilent 2100 Bioanalyzer immediately following extraction. This analysis provides an RNA Integrity Number (RIN) as a quantitative measure of RNA quality. Second, following globin reduction, the samples can be compared to the globin-reduced standards. Finally, the scaling factors and background can be assessed following hybridization to the microarrays. Processed RNA can undergo automated cRNA probe production and hybridization using the Affymetrix GeneChip™ High Throughput (HT) Array System for whole-genome transcript analysis. The Affymetrix HT Array System uses the Affymetrix U133A gene set of over 22,000 separate transcripts.

Metabolomic assays can be performed by gas chromatography/mass spectrometry (GC/MS) or tandem mass spectrometry (MS/MS) to measure a number of distinct analytes. Pre-packaged tubes that contain a protease inhibitor cocktail may be used for collection.

Tandem mass spectrometry can also be used to measure the following 15 distinct amino acid species: lysine, alanine, serine, proline, valine, leucine/isoleucine (an isobaric pair assayed as a single amino acid), methionine, histidine, phenylalanine, tyrosine, aspartate, glutamate, ornithine, citrulline, and arginine.

Proteomic assay methods include but are not limited to chromatography, mass spectrometry, antibody assays, and protein arrays.

Biased proteomic assays, including but not limited to inflammatory cytokine profiling, can also be performed. Biased proteomic assays are not limited to the methods described herein. Multiplex and ELISA techniques can be used to measure levels of IL-1-beta, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-12, IL-13, IL-17, interferon (IFN)-gamma, TNF-alpha, granulocyte colony stimulating factor (G-CSF), granulocyte macrophage colony stimulating factor (GM-CSF), monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein (MIP)-1-beta and C-reactive protein. Additionally, multiplex and ELISA techniques can be used to measure levels of the following hormones involved in metabolic homeostasis and energy balance: insulin, glucagon, leptin, adiponectin, ghrelin, resistin, and insulin-like growth factor-I. Multiplex antibody assays, such as those described in PCT/US10/23917, incorporated herein by reference in its entirety, may also be used to assay proteins in conjunction with the methods of the invention.

An in vivo blood RNA-based signature that discriminates between candidemia and bacteremia, as well as between infected mice and healthy controls, has been developed and validated. The results provide clear evidence that a unique blood gene expression signature classifies candidemia with a remarkable degree of accuracy. The method of the present invention is able to discriminate between different durations of time after tail vein injection of C. albicans using such distinct blood gene expression signatures, and these data have been used as predictors in new samples. These gene expression signatures were determined in an unsupervised manner, meaning that the model determined relevant co-expressing genes without a priori information about the samples and their labels. Thus, relevant genes clustered together and were then able to predict the type or duration of infection in rigorous cross-validation. These findings provide compelling evidence that blood gene expression can accurately discriminate classes of infectious pathogens, specifically fungal infection from a common bacterial bloodstream infection, and may potentially serve as a useful diagnostic for triaging treatment decisions for nosocomial infections, specifically candidemia.

In one embodiment, the invention provides methods for identifying invasive candidiasis in a subject. The method comprises screening the gene expression levels in a peripheral blood cell sample for expression of genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling. The expression of genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling is indicative of the infection status of the subject. The methods may allow a subject infected with Candida to be identified before the subject develops symptoms of invasive candidiasis.

In another embodiment, the invention provides methods for identifying invasive candidiasis in a subject. The method comprises screening the gene expression levels in a peripheral blood cell sample for at least three (including four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty) genes encoding a protein listed in Table 1. The expression levels of genes encoding the proteins listed in Table 1 are indicative of the infection status of the subject. The method may comprise screening for at least five (including six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty) genes encoding a protein listed in Table 1. The method may comprise screening for at least ten (including eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty) genes encoding a protein listed in Table 1. The methods may allow a subject infected with Candida to be identified before the subject develops symptoms of invasive candidiasis.

TABLE 1
Proteins indicative of candidemic infection

CD151 antigen                          melanoma associated antigen
Paired-Ig-like receptor B              type 1 tumor necrosis factor receptor
DnaJ (Hsp40) homolog, subfamily        early growth response 1
Golgi autoantigen, golgin subfamily    B-cell receptor-associated protein
Proteasome (prosome, macropain)        lymphocyte antigen 6 complex
3-phosphoinositide dependent protein   E74-like factor 1
YTH domain family 1                    cell division cycle 37 homolog
inosine 5′-phosphate dehydrogenase     LIM and senescent cell antigen-1
CD83 antigen                           Fas-associated factor 1
interleukin 2 receptor, gamma chain    MAP kinase-interacting serine/threonine kinase 2
early B-cell factor 1                  golgi autoantigen, golgin subfamily
Sjogren syndrome antigen B             exosome component 9
CD164 antigen                          glomulin, FKBP associated protein
small inducible cytokine subfamily     B-cell leukemia/lymphoma 10

In another embodiment, the invention provides methods for distinguishing subjects infected with invasive candidiasis from subjects having bacterial and/or viral infections. The method comprises screening the gene expression levels in a peripheral blood cell sample for expression of genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling. The expression of genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling is indicative of the infection status of the subject.

In another embodiment, the invention provides methods for distinguishing subjects infected with invasive candidiasis from subjects having bacterial and/or viral infections. The method comprises screening the gene expression levels in a peripheral blood cell sample for at least three (including four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty) genes encoding a protein listed in Table 1. The expression levels of genes encoding the proteins listed in Table 1 are indicative of the infection status of the subject. The method may comprise screening for at least five genes encoding a protein listed in Table 1. The method may comprise screening for at least ten genes encoding a protein listed in Table 1.

In another embodiment, the invention provides assays for identifying invasive candidiasis in a subject. The assay includes determining the gene expression level for genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling. The expression levels of genes indicating IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling are indicative of the infection status of the subject. The assay may allow a subject infected with Candida to be identified before the subject develops symptoms of invasive candidiasis.

In another embodiment, the invention provides assays for identifying invasive candidiasis in a subject. The assay includes determining the gene expression levels in a peripheral blood cell sample for at least three (including four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty) genes encoding a protein listed in Table 1. The expression levels of genes encoding the proteins listed in Table 1 are indicative of the infection status of the subject. The assay may allow a subject infected with Candida to be identified before the subject develops symptoms of invasive candidiasis.

Recitation of the number of genes for which gene expression is determined is merely intended to serve as a shorthand method of referring individually to each separate gene listed in Table 1 as appropriate, and each separate gene is incorporated into the specification as if it were individually recited herein. It also is understood that any numerical range of genes recited herein includes all values from the lower value (e.g., three) to the upper value of genes listed in the appropriate column.

The person obtaining the peripheral blood sample need not perform the comparison, however, as it is contemplated that a laboratory may communicate the gene expression levels to a medical practitioner for the purpose of identifying a subject infected with candidiasis. Additionally, it is contemplated that a medical professional, after examining a patient, would order an agent to obtain a peripheral blood sample, have the sample screened for gene expression and compared to standard values, and have the agent report the patient's infection status to the medical professional. Once the medical professional has obtained the patient's infection status, the medical professional could order suitable treatment, for example anti-fungals.

In order to complete the comparison, the invention contemplates a computer readable medium comprising standard gene expression levels of at least three genes encoding a protein selected from the group consisting of CD151 antigen, paired-Ig-like receptor B, DnaJ (Hsp40) homolog subfamily proteins, Golgi autoantigen, golgin subfamily proteins, proteasome (prosome, macropain), 3-phosphoinositide dependent protein, YTH domain family 1 proteins, inosine 5′-phosphate dehydrogenase, CD83 antigen, interleukin 2 receptor, gamma chain, early B-cell factor 1, Sjogren syndrome antigen B, CD164 antigen, small inducible cytokine subfamily proteins, melanoma associated antigen, type 1 tumor necrosis factor receptor, early growth response 1, B-cell receptor-associated protein, lymphocyte antigen 6 complex, E74-like factor 1, cell division cycle 37 homolog, LIM and senescent cell antigen-1, Fas-associated factor 1, MAP kinase-interacting serine/threonine kinase 2, golgi autoantigen, golgin subfamily proteins, exosome component 9, glomulin, FKBP associated protein, and B-cell leukemia/lymphoma 10, and responsivity information indicating changes in the gene expression levels of the at least three genes when a subject is infected with candidiasis. In other embodiments, the computer readable medium may have gene sets and responsivity data specific for genes involved in IL-10 signaling, LXR/RXR activation, and toll-like receptor signaling.
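One way such a stored record might be laid out is sketched below in Python, using JSON as the serialization. The layout, the function name `build_record`, and the field names are assumptions for illustration; the protein names come from Table 1, but every level and responsivity value shown is a placeholder, not a measurement.

```python
import json

MIN_GENES = 3  # the text calls for at least three genes


def build_record(standard_levels, responsivity):
    """Pair each gene's standard level with its responsivity information."""
    if set(standard_levels) != set(responsivity):
        raise ValueError("each gene needs a standard level and responsivity")
    if len(standard_levels) < MIN_GENES:
        raise ValueError("at least three genes are required")
    return {
        gene: {
            "standard_level": standard_levels[gene],
            "responsivity": responsivity[gene],
        }
        for gene in sorted(standard_levels)
    }


# PLACEHOLDER values only: levels and directions are not from the specification.
record = build_record(
    {"CD151 antigen": 1.0, "CD83 antigen": 1.0, "early growth response 1": 1.0},
    {"CD151 antigen": "changed", "CD83 antigen": "changed",
     "early growth response 1": "changed"},
)

serialized = json.dumps(record, indent=2)  # text suitable for storage on a medium
restored = json.loads(serialized)
print(sorted(restored))
```

Writing `serialized` to a file on disk, or serving it from a network location, would place the same data on any of the media listed in the definition of “computer readable medium.”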

Other aspects of the invention can become apparent by consideration of the detailed description and accompanying drawings.

Before any embodiments of the invention are explained in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the following drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways.

Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any nonclaimed element as essential to the practice of the invention.

It also is understood that any numerical range recited herein includes all values from the lower value to the upper value. For example, if a concentration range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. These are only examples of what is specifically intended, and all possible combinations of numerical values between and including the lowest value and the highest value enumerated are to be considered to be expressly stated in this application.

Further, no admission is made that any reference, including any patent or patent document, cited in this specification constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinency of any of the documents cited herein.

EXAMPLES

Example 1 Candida albicans and Staphylococcus aureus Inoculum Preparation

C. albicans SC5314 (American Type Culture Collection) and S. aureus Sanger 476 (a gift of V. G. Fowler) were used for inducing candidemia and bacteremia, respectively. C. albicans was grown in yeast-peptone-dextrose (YPD) broth for 12 to 17 hours at 25° C. and 225 rpm. The fungal broth was washed with sterile phosphate-buffered saline (PBS) and adjusted to a dose of 2×10⁵ colony-forming units (CFUs) per 200 μl of inoculum. This dose (“2×10e5”) produced 50% mortality by day 6 and avoided death of animals during the 96-hour collection window (FIG. 1A). S. aureus was grown in trypticase soy broth (TSB) at 30° C. and 225 rpm with aeration overnight. Overnight broth (1 to 2 ml) was suspended in 100 ml of fresh TSB and shaken at 150 rpm for 1 to 2 hours at 30° C. Once in log phase, the bacterial broth was harvested by centrifugation, washed with sterile PBS, and resuspended to an OD₆₀₀ of 0.20. This corresponded to a dose of 5×10⁷ CFU/ml (1×10⁷ CFU/200 μl of inoculum) and produced 50% mortality by day 4 after infection based on dose-finding experiments shown as “1.0×10e7” in FIG. 1B.
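The dose conversions in this paragraph are simple volume arithmetic, shown here as a small Python check. The helper names are hypothetical, and the C. albicans concentration printed below is only the value implied by the stated dose and volume; it is not given explicitly in the text.

```python
def cfu_per_inoculum(cfu_per_ml, inoculum_ul):
    """CFUs delivered in an inoculum of the given volume (microliters)."""
    return cfu_per_ml * inoculum_ul / 1000.0


def cfu_per_ml_for_dose(dose_cfu, inoculum_ul):
    """Suspension concentration needed to deliver dose_cfu in the inoculum."""
    return dose_cfu * 1000.0 / inoculum_ul


# S. aureus: 5e7 CFU/ml injected as 200 microliters.
s_aureus_dose = cfu_per_inoculum(5e7, 200)

# C. albicans: 2e5 CFU per 200 microliters implies this concentration.
c_albicans_conc = cfu_per_ml_for_dose(2e5, 200)

print(f"S. aureus dose: {s_aureus_dose:.1e} CFU per inoculum")
print(f"C. albicans suspension: {c_albicans_conc:.1e} CFU/ml")
```

The first result reproduces the 1×10⁷ CFU per 200 μl figure stated for S. aureus.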

Using a tail vein injection model, dose-finding studies were performed to determine a dose of Candida albicans that produced 50% mortality at day 5 after injection (FIG. 1). Overall experimental design is shown in FIG. 2.

After injection with C. albicans, all infected mice lost weight over the course of the experiment and the weight loss correlated with the duration of infection (FIG. 3). Among the infected mice, average weight loss relative to baseline weight was 0.4±0.53 g for the mice killed at day 1 (24 hours) after injection, 1.39±0.6 g for the mice killed at day 2 (48 hours) after injection, and 2.6±0.99 g for the mice killed at day 3 or 4 (72 to 96 hours) after injection. On the other hand, control mice gained an average of 1.88±0.88 g of weight over the 96-hour duration of the experiment (P=6.05×10⁻⁷, Student's t test, for difference in average weight loss between day 1 and days 3 to 4; error bars represent SD). None of the mice met preset criteria for early euthanasia during the course of the experiment.

Example 2 C. albicans and S. aureus Infection in Mice

All murine work was approved by the Duke University Institutional Animal Use and Care Committee. An overview of experimental design is shown in FIG. 2. After determination of the optimal dose of C. albicans needed to produce 50% morbidity at 5 days after tail vein injection, a cohort of mice (BALB/cJ, male, weighing 20 to 25 g, age 8 weeks; The Jackson Laboratory) was infected for development of Candida bloodstream infection. Twenty-eight mice were injected with 2×10⁵ CFU per inoculum suspended in PBS, and 12 mice were injected with the vehicle (PBS) via the tail vein. Daily weights and twice-daily activity levels were noted to assess the clinical status of the mice. Mice were then killed by CO₂ asphyxiation at predetermined time intervals of 24 hours (n=8 mice), 48 hours (n=6 mice), 72 hours (n=7 mice), and 96 hours (n=7 mice) after injection. Control mice were also killed on days 1 and 4 and processed identically to their experimental counterparts. Blood (500 μl) was collected from all mice by cardiac puncture and placed in RNAlater tubes provided with the Mouse RiboPure RNA Isolation kit per the manufacturer's instructions (Ambion). Finally, the right kidney was collected into sterile PBS, weighed, homogenized, serially diluted, and plated on YPD agar for colony count. A validation set of C. albicans-infected and control mice was set up in an identical manner. The murine S. aureus bacteremia model was also set up in an identical manner to the candidemic model, except that a dose of 5×10⁷ CFU/ml was used to induce bacteremia. Mice were monitored twice daily after injection and killed at predetermined time points (24, 48, or 72 hours after infection), and whole blood was collected and placed immediately into RNAlater tubes included with the Mouse RiboPure RNA Isolation kit per the manufacturer's instructions. The right kidney was removed aseptically and placed into sterile saline, weighed, homogenized, and plated on trypticase soy agar for colony count.
Blood from mice with confirmed infection (by colony count) was used for RNA extraction and microarray analysis. Weight loss in the S. aureus-infected mice was progressive across the experiment, with average weight loss of 1.7±0.46 g at 24 hours after injection, 4.6±0.53 g at 48 hours after injection, and 4.2±1.04 g at 72 hours after injection. The associated fungal burden is shown in FIG. 3A and the associated bacterial burden in FIG. 3B.

Example 3 S. aureus Infection in Mice

A group of nine mice (BALB/cJ, male, weighing 20 to 25 g, age 8 weeks) received tail vein injection with 200 μl of S. aureus Sanger 476 (5×10⁷ CFU/ml). Three received PBS injection (200 μl), as described above. This produced a morbidity curve similar to that of the C. albicans infection (FIG. 1). Mice were killed at predetermined time points, and β-globin-reduced whole-blood mRNA was hybridized to Affymetrix Mouse 430A 2.0 microarrays in a similar fashion to the candidemia cohorts. Colony counts from murine kidneys were plated on trypticase soy agar and counted at 24 hours to determine bacterial burden.

Example 4 RNA Preparation

Whole-blood RNA isolation and β-globin reduction were carried out using the manufacturer's protocol (Mouse RiboPure and GLOBINclear, Ambion). The amount and purity of the RNA yield were analyzed using a NanoDrop spectrophotometer (Thermo Fisher Scientific), and its integrity was analyzed using an Agilent Bioanalyzer. RNA from samples that met these quality checks (260/280 ratio >1.8, 260/230 ratio >1.8, and RNA integrity number >7) was amplified and biotin-labeled using the MessageAmp Premier RNA Amplification kit (Ambion) according to standard protocols at the Duke University Microarray Core facility known in the art. Amplification and hybridization onto Affymetrix murine 430A 2.0 microarrays were performed by the Duke University Microarray Core. Probe intensities were detected using an Axon GenePix 4000B Scanner (Molecular Devices). Image files were generated using Affymetrix GeneChip Command Console software.
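The three quality thresholds stated above lend themselves to a simple filter, sketched here in Python. The `RnaSample` record, its field names, and the sample readings are hypothetical and serve only to show how the cutoffs would be applied before amplification.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    sample_id: str
    a260_280: float  # NanoDrop 260/280 absorbance ratio
    a260_230: float  # NanoDrop 260/230 absorbance ratio
    rin: float       # Bioanalyzer RNA integrity number


def passes_qc(sample, min_ratio=1.8, min_rin=7.0):
    """Apply the stated cutoffs: both ratios >1.8 and RIN >7 (strict)."""
    return (
        sample.a260_280 > min_ratio
        and sample.a260_230 > min_ratio
        and sample.rin > min_rin
    )


# Illustrative readings only; not data from this specification.
samples = [
    RnaSample("m01", 2.0, 2.1, 8.5),
    RnaSample("m02", 1.7, 2.0, 9.0),  # fails on the 260/280 ratio
    RnaSample("m03", 1.9, 1.9, 7.4),
    RnaSample("m04", 2.0, 2.0, 7.0),  # fails: RIN must exceed 7
]
passing = [s for s in samples if passes_qc(s)]
print([s.sample_id for s in passing])
```

The comparisons are strict because the text states the criteria as “>1.8” and “>7”.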

Example 5 Genomic Signatures for Identifying Invasive Candidiasis (IC)

Invasive candidiasis (IC) is the 4th most common cause of blood stream infection (BSI) in hospitalized patients. Given its high mortality (10-14%), there is an urgent need to diagnose IC earlier and more accurately. Gene expression profiling (GEP) from RNA isolated from host blood provides an alternative approach for an early and accurate diagnostic for IC.

Murine models of invasive candidiasis are an ideal means for defining molecular expression patterns that are “signatures” of invasive candidiasis. The identification of specific expression signatures in a controlled laboratory environment provides a framework for separating true “signal” from the inherent “noise” generated in human gene, protein, and metabolite expression studies; it also allows for control of environmental factors. As such, gene expression signatures derived from sampling at distinct time points following pathogen exposure can aid in both discovery and validation of expression patterns seen in human endogenous infection.

Forty-three BALB/c mice (10 weeks old) were injected via the tail vein with either 2×10⁵ cfu of C. albicans strain SC5314 (n=31) or PBS (n=12). The infected mice were sacrificed at 24, 48, 72, and 96 hours post infection. The control group was sacrificed at 24 and 96 hours. Blood was collected into RNAlater (Ambion) and underwent RNA isolation, β-globin reduction, and hybridization to Affymetrix 430A_2 arrays. Signatures characterizing IC were developed using sparse latent factor regression analysis (Bayesian Factor Regression Model).

Distinct factors (approx. 80 genes/factor) characterized infected vs. uninfected mice, as well as early infection (48 hours) vs. late infection (72-96 hours), with >95% accuracy in an unsupervised analysis. Predictions remained valid when dividing the cohorts into training and validation sets. Leave-one-out cross validation identified infected versus uninfected mice with 95% accuracy (FIG. 12C). The model can also distinguish each day post infection from the other timepoints (FIG. 12C, Day 2) and performs well on a training and validation set (FIG. 12D). Sensitivity and specificity for determining infected versus uninfected mice are 100%. Additionally, early candidemia (FIGS. 12C and 12D, as well as day 1 data not shown) can be distinguished from both uninfected mice and late time points of infection (days 3 and 4) with similar accuracy. Notably, the current 1-3-BDG testing has a sensitivity of approximately 80%. Many of the genes found to be most predictive in these analyses are immune-related, lending biologic plausibility to our findings. The genes in the infected vs. uninfected factor were most strongly correlated with the immune pathways of IL-10 signaling (p<10⁻⁷), LXR/RXR activation (p<10⁻⁴), and toll-like receptor signaling (p<10⁻³).

GEP yields signatures composed of factors that distinguish infected from control host with greater than 95% accuracy and can also predict the duration of infection within 72 hours. The genes identified are consistent with host response to infection and provide a robust alternative diagnostic indicator. Candidemic mice were distinguished from mice with Staphylococcus aureus bacteremia with key genes involved in this distinction including the following: CD151, Lilrb3, CD83, Ly6a and Scye1. (See, Table 1).

The infectious status of a subject exposed to Candida may also be identified by measuring gene expression levels for at least three, typically five, more typically ten genes which encode a protein from the list of proteins in Table 1. The expression of the genes may be determined using any methods known in the art for assaying gene expression. Gene expression may be determined by measuring mRNA or protein levels for the genes. In a preferred embodiment, an mRNA transcript of a gene may be detected for determining the expression level of the gene. Based on the sequence information provided by the GenBank™ database entries, the genes can be detected and expression levels measured using techniques well known to one of ordinary skill in the art. For example, sequences within the sequence database entries corresponding to polynucleotides of the genes can be used to construct probes for detecting mRNAs by, e.g., Northern blot hybridization analyses. The hybridization of the probe to a gene transcript in a subject biological sample can be also carried out on a DNA array. The use of an array is preferable for detecting the expression level of a plurality of the genes. As another example, the sequences can be used to construct primers for specifically amplifying the polynucleotides in, e.g., amplification-based detection methods such as reverse-transcription based polymerase chain reaction (RT-PCR). As another example, mRNA levels can be assayed by quantitative RT-PCR. Furthermore, the expression level of the genes can be analyzed based on the biological activity or quantity of proteins encoded by the genes.

This study demonstrated the ability to detect invasive candidiasis early, e.g., prior to signs of severe illness (weight loss >20% body weight, ruffled hair, hunched posture, minimal movement) or prior to documented Candida infection by other methods (for example, culture).

Example 6 Statistical Analysis

Workflow for the statistical analysis is shown in FIG. 4. Expression Console (Affymetrix) was used to ensure that microarray data intensity files met quality control parameters. The MAS5 (GeneChip Operating Software, Affymetrix) expression summary algorithm was used because it provided greater accuracy by eliminating noise from transcripts with low expression levels (M. Barenco, J. Stark, D. Brewer, D. Tomescu, R. Callard, M. Hubank, Correction of scaling mismatches in oligonucleotide microarray data. BMC Bioinformatics 7, 251 (2006)). Next, expression data from samples that met the quality standards were used in Matrix Laboratory (“MATLAB”) to develop predictive models. The data were analyzed using sparse latent factor regression, a form of unsupervised analysis, to look for inherent patterns in the data (D. M. Seo, P. J. Goldschmidt-Clermont, M. West, Of mice and men: Sparse statistical modeling in cardiovascular genomics. Ann. Appl. Stat. 1, 152-178 (2007); J. E. Lucas, C. M. Carvalho, J.-Y. Chen, J.-I. Chi, M. West, Sparse statistical modeling in gene expression genomics, in Bayesian Inference for Gene Expression and Proteomics (Cambridge Univ. Press, New York, 2006), pp. 155-176). Unsupervised analysis means that the analytical algorithm was not given any a priori information about the samples, such as infection status. Sparse factor regression was also used to discover time-varying expression patterns within a regression model framework according to the following relation:

y=mean+AS′+Error  (1)

In Eq. 1, y is a P×N-dimensional matrix of probe intensities, where P is the number of probe sets and N is the number of samples (mice). S is an N×K-dimensional matrix representing the latent effects or factors that are to be discovered from the time-varying (or infectious agent-varying) changes in probe intensities, and A is a P×K-dimensional matrix of factor loadings (regression coefficients). A sparsity criterion was used, whereby the model encourages many of the elements in the loadings matrix, A, to be zeros as long as the patterns observed in y are well explained by the product matrix AS' (i.e., the values in Error are minimized). One may view the fitting of this model as a dimension reduction technique, as the initial P features (in our case P>40,000) are being described with a set of K latent factors (K=40 in the model described here). Although it is possible to use a fixed-effects model to describe the known time and treatment effects, this requires that the fixed effects be known. In a complex experiment with more than two groups, this can be a challenge. Latent factor regression allowed the discovery of the behavior of genes across all experimental groups in a way that is data driven and does not require perfect knowledge of the expression patterns.
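As a hedged illustration of Eq. 1, the following minimal Python sketch simulates data from a sparse factor model. It is not the BFRM software; the dimensions (P=200, N=40, K=5), the 10% loading density, the noise level, and the function name simulate_sparse_factor_model are assumptions chosen only to show the shapes of y, A, and S and the effect of the sparsity criterion on A.

```python
import random

def simulate_sparse_factor_model(P=200, N=40, K=5, density=0.1, noise_sd=0.1, seed=0):
    """Simulate y = mean + A S' + Error with a sparse P x K loadings matrix A.

    All sizes and distributions are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # "mean" term of Eq. 1: one baseline intensity per probe set.
    baseline = [rng.gauss(8.0, 1.0) for _ in range(P)]
    # Sparse loadings: most entries are exactly zero, so each factor
    # is driven by only a small subset of probe sets.
    A = [[rng.gauss(0.0, 1.0) if rng.random() < density else 0.0
          for _ in range(K)] for _ in range(P)]
    # Latent factor scores: one row per sample, one column per factor.
    S = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(N)]
    # y[p][n] = baseline[p] + sum_k A[p][k] * S[n][k] + noise
    y = [[baseline[p]
          + sum(A[p][k] * S[n][k] for k in range(K))
          + rng.gauss(0.0, noise_sd)
          for n in range(N)] for p in range(P)]
    return y, baseline, A, S

y, baseline, A, S = simulate_sparse_factor_model()
zero_fraction = sum(a == 0.0 for row in A for a in row) / (len(A) * len(A[0]))
print("y is", len(y), "x", len(y[0]))            # P x N
print("fraction of zero loadings:", round(zero_fraction, 2))
```

In a fitted model the direction is reversed: y is observed, and A and S are inferred, with the prior on A favoring zeros.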

The details of fitting and inference for Bayesian sparse factor models (BFRM) are published (J. E. Lucas, C. M. Carvalho, J.-Y. Chen, J.-I. Chi, M. West, Sparse statistical modeling in gene expression genomics, in Bayesian Inference for Gene Expression and Proteomics (Cambridge Univ. Press, New York, 2006), pp. 155-176; A. E. Raftery, D. Madigan, J. A. Hoeting, Bayesian model averaging for linear regression models. J. Am. Stat. Assoc. 92, 179-191 (1997); C. Hans, A. Dobra, M. West, Shotgun stochastic search for “Large p” regression. J. Am. Stat. Assoc. 102, 507-516 (2007)) and were used to decompose the expression patterns of the mice from this experiment into a set of factors. This is a sparse factor regression, which builds factors based on coexpression of relatively large subsets of genes. As an initial method of dimension reduction, the analysis was restricted to a collection of 2039 probe sets that are associated with the immune response using GO categories as a basis for selection (listed in Table 1), and from this subset, forty factors were obtained, each largely determined by the expression of between 13 and 475 probe sets (depending on the factor). The analysis was also performed using all probe sets present on the array (FIGS. 5-7), without changing the ability of the model to classify infectious etiology or to predict time after injection. The initialization files and output from BFRM are listed below:

These are the files associated with SSS.

data.txt: The independent predictor variables. Each row is a different sample and each column is a different predictor.

responses.txt: The dependent variable, i.e., the variable that is to be predicted.

iterout.txt: The step-by-step summary of model changes that SSS outputs. Not important for later use.

nullfile.txt: The model fit information for the null model (the model that assumes that the dependent variable is not related to the independent variables).

weights.txt: For weighted regression. All ones for standard regression. Zeros for samples that are not being used in the regression (used for cross-validation).

sss.binary.txt: The parameter file for SSS.

modelsummary.txt: A list of the top 1000 models visited by the software in order from best to worst fit. The first column lists the number of variables in the model, and the second lists the log-likelihood (including prior densities). The next n columns list the variables that were included in the model. The next n+1 columns list the posterior mean of the regression coefficients (including an intercept term), and the remaining (n+1)^2 columns list the posterior covariance matrix of the regression coefficients.

To compute a model-averaged prediction, compute the model fit for each model, then weight those predictions by exp(column 2), i.e., the exponential of the log-likelihood. Full details of this software are available in Hans C, Dobra A, West M. Shotgun stochastic search for “Large p” regression. J. Am. Stat. Assoc. 102, 507-516 (2007).

The “model-averaged” prediction is then the weighted average of the predictions of each of the tested models. The weighting ensures that models that fit the training data best are more influential in the computation of the average. This approach has the advantage of better accounting for uncertainty in the choice of models and therefore gives more robust predictions. Additionally, because this analysis can keep track of all of the models, their weights, and their respective regression coefficients, the model can be applied to new data sets (for the purposes of prediction) in the same way that a standard regression is applied. These factors were then used as predictors in a linear regression with variable selection priors on the coefficients. This leads to a large collection of parsimonious models (involving one to four of the predictors). Predictions were then based on model averaging, which has been shown to better account for uncertainty in the model designation and which leads to superior predictive accuracy when compared to the single best model. A stochastic search implementation of model averaging, shotgun stochastic search (SSS) (Hans C, Dobra A, West M. Shotgun stochastic search for “Large p” regression. J. Am. Stat. Assoc. 102, 507-516 (2007)) was used, and an example of its use in this context (using factors computed with BFRM to predict outcomes) has been included in the Supplementary Material (“SSS example”). A tutorial on the use of this software can be found at http://www.stat.duke.edu/research/software/west/sss/readme.serial.html. The algorithm for generating predictive models of infection with Candida using 100 repetitions of 10-fold cross-validation was tested. This results in a range of possible predictions for each sample depending on which of the remaining samples were used to build the predictive model.
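The weighting described above can be sketched in Python as follows. This is a minimal illustration and not the SSS software: the dictionary layout loosely mirrors the columns of modelsummary.txt (log-likelihood, included variables, posterior mean coefficients with the intercept first), while the two example models, their coefficients, and the factor scores are hypothetical. Only the linear predictor is averaged here; for a binary outcome a link function would be applied, which is omitted as a simplifying assumption.

```python
import math

def model_averaged_prediction(models, x):
    """Average per-model predictions, weighting each model by exp(log-likelihood).

    models: list of dicts with keys
        'loglik'    - log-likelihood including prior densities (column 2)
        'variables' - indices of the predictors (factors) in the model
        'coefs'     - posterior mean coefficients, intercept first
    x: list of factor scores for one new sample.
    """
    # Subtracting the largest log-likelihood before exponentiating leaves the
    # normalized weights unchanged and avoids numerical underflow.
    max_ll = max(m["loglik"] for m in models)
    weights = [math.exp(m["loglik"] - max_ll) for m in models]
    predictions = []
    for m in models:
        intercept, *betas = m["coefs"]
        predictions.append(
            intercept + sum(b * x[j] for b, j in zip(betas, m["variables"]))
        )
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)

# Hypothetical example: two parsimonious models over three factor scores.
models = [
    {"loglik": -10.0, "variables": [0], "coefs": [0.1, 1.0]},
    {"loglik": -12.0, "variables": [0, 2], "coefs": [0.0, 0.5, 0.25]},
]
x_new = [0.5, -1.2, 2.0]
averaged = model_averaged_prediction(models, x_new)
print(round(averaged, 4))
```

Because the first model has the higher log-likelihood, the averaged value lies closer to its prediction than to that of the second model.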

Example 7 Pathways Analysis

GATHER (J. T. Chang, J. R. Nevins, GATHER: A systems approach to interpreting genomic signatures. Bioinformatics 22, 2926-2933 (2006)) (http://gather.genome.duke.edu) and GeneGo (http://www.genego.com) were used for functional annotation of genes. The significance of the association between the data set and the canonical pathway was measured by Fisher's exact test to calculate a P value and by ratio of the genes from the data set that mapped to the pathway divided by the total number of genes in the pathway. Similar methodology was used in the construction of networks most associated with genes from the data set.
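The two association measures named above can be illustrated with a short Python sketch. It does not reproduce GATHER or GeneGo; it assumes a one-sided Fisher's exact test for over-representation (equivalent to a hypergeometric upper tail), and the counts in the example (5 signature genes in a 50-gene pathway, a 60-gene signature, a 2000-gene background) are hypothetical.

```python
from math import comb

def pathway_enrichment(hits, signature_size, pathway_size, universe_size):
    """Return (ratio, P value) for over-representation of a pathway in a gene set.

    ratio: genes from the data set that map to the pathway / genes in the pathway.
    P value: probability of seeing at least `hits` pathway genes when
    `signature_size` genes are drawn at random from `universe_size` genes.
    """
    ratio = hits / pathway_size
    max_hits = min(signature_size, pathway_size)
    total = comb(universe_size, signature_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, signature_size - k)
        for k in range(hits, max_hits + 1)
    )
    return ratio, tail / total

# Hypothetical counts, for illustration only.
ratio, p_value = pathway_enrichment(hits=5, signature_size=60,
                                    pathway_size=50, universe_size=2000)
print(ratio, p_value)
```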

Example 8 A Blood RNA Gene Expression Signature Differentiated Mice with Candidemia from Uninfected Mice

To begin exploring a gene expression signature associated with C. albicans infection, an unsupervised analysis of the gene expression data on blood samples from mice exposed to C. albicans and control mice (n=28 infected; n=12 controls) was carried out. Sparse latent factor regression analysis was used to determine a set of genes (“factor” or “signature”) that most consistently coexpressed across samples from all infected mice (FIG. 4). As an initial method of dimension reduction, the analysis was restricted to a collection of 2039 probe sets that are associated with the immune response using Gene Ontology (GO) categories as a basis for selection.

The analysis was also performed using all probe sets present on the array (FIGS. 5-7), without changing the ability of the model to classify infectious etiology or to predict time after injection. FIG. 5 shows a factor plot showing distinction between control mice and mice with candidemia for training and validation cohorts. FIG. 6 shows the predictive models of infection with C. albicans generated by the algorithm, tested using 100 repetitions of 10-fold cross-validation on the entire probe set. As shown in FIG. 7, the severity of infection (as represented by days after tail vein injection) can be predicted by gene expression signatures using an unsupervised analysis of the candidemic mice and controls alone and the entire probe set.

The sparse latent factor regression analysis produced 20 such sets of 60 to 80 coexpressing genes. From these factor sets, the most robust factor that could differentiate infected from control samples was selected. Hereafter referred to as the “candidemia versus control” (CavC) factor, this factor consists of 82 unique probes representing 67 unique genes (see Table 2A). The CavC factor had 97.5% accuracy in discriminating infected from uninfected samples, with sensitivity and specificity of 96% and 100%, respectively (FIGS. 8A-D). FIGS. 8A-B show a heat map of the 83 genes contained in the factor (group of coexpressed genes) that discriminates between control mice and mice with candidemia for training and validation cohorts (FIG. 8A) and the corresponding factor plot (FIG. 8B). Blood gene expression signatures distinguish murine candidemia from healthy controls. As shown in FIG. 8A, the training cohort (controls killed at 24 hours indicated by black bar; controls killed at 96 hours indicated by gray bar) is at the left of the heat map; the validation cohort (all controls killed at 24 hours) is at the right. FIG. 8B shows the distinction between control mice and mice with candidemia for training and validation cohorts. As shown in the factor plot, infected mice demonstrate factor scores that are consistently distinct from controls. In both the training and validation cohorts, one sample per cohort is misclassified (an infected mouse classified as a control).
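The reported performance figures follow from simple counts, as the minimal Python sketch below shows. The counts are an assumption inferred from this Example (28 infected and 12 control samples in the training cohort, with one infected sample classified as a control); the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # infected samples correctly called infected
    specificity = tn / (tn + fp)   # control samples correctly called control
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Assumed training-cohort counts: 27 of 28 infected and 12 of 12 controls correct.
sens, spec, acc = classification_metrics(tp=27, fn=1, tn=12, fp=0)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts the sketch yields a sensitivity of about 96%, a specificity of 100%, and an accuracy of 97.5%, matching the values stated for the CavC factor.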

TABLE 2A Candida vs Control Genes, where the gene name is followed by the protein it encodes
Gene      Protein
Adam19    Disintegrin and metalloproteinase domain-containing protein 19
Adam8     Disintegrin and metalloproteinase domain-containing protein 8
Anxa2     annexin A2
Bcl6      B-cell CLL/lymphoma 6 (zinc finger protein 51)
Bst1      bone marrow stromal cell antigen 1
Ccr1      chemokine (C-C motif) receptor 1
Cd14      CD14 molecule
Cd177     CD177 molecule
Cd300Lf   CD300 molecule-like family member f
Cd33      CD33 antigen
Cd52      CD52 molecule
Cklf      chemokine-like factor
Crlf2     cytokine receptor-like factor 2
Csf2Rb    colony stimulating factor 2 receptor, beta, low-affinity
Csf3R     colony stimulating factor 3 receptor (granulocyte)
Ebi3      Epstein-Barr virus induced gene 3
Fcer1G    Fc fragment of IgE, high affinity I, receptor for; gamma polypeptide
Fcgr2A    Fc fragment of IgG, low affinity IIa, receptor (CD32)
Glipr1    GLI pathogenesis-related 1 (glioma)
Gpr97     G protein-coupled receptor 97
Icoslg    inducible T-cell co-stimulator ligand
Ifitm1    interferon induced transmembrane protein 1 (9-27)
Ifitm2    interferon induced transmembrane protein 2 (1-8D)
Ifitm6    interferon induced transmembrane protein 6
Ikbkap    inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein
Il10Rb    interleukin 10 receptor, beta
Il13Ra1   interleukin 13 receptor, alpha 1
Il1B      interleukin 1, beta
Il1F9     interleukin 1 family, member 9
Il1R2     interleukin 1 receptor, type II
Il1Rap    interleukin 1 receptor accessory protein
Il1Rn     interleukin 1 receptor antagonist
Il8Rb     interleukin 8 receptor, beta
Irg1      immunoresponsive 1 homolog (mouse)
Klra1     killer cell lectin-like receptor subfamily A, member 1
Lcp1      lymphocyte cytosolic protein 1 (L-plastin)
Lgals3    lectin, galactoside-binding, soluble, 3
Lilra6    leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 6
Lilrb4    leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 4
Lsp1      lymphocyte-specific protein 1
Ly6C1     lymphocyte antigen 6 complex, locus C1
Lyst      lysosomal trafficking regulator
Mboat7    membrane bound O-acyltransferase domain containing 7
Nfam1     NFAT activating protein with ITAM motif 1
Nfil3     nuclear factor, interleukin 3 regulated
Ngp       neutrophilic granule protein
Nlrp3     NLR family, pyrin domain containing 3
Pbx1      pre-B-cell leukemia homeobox 1
Pxn       paxillin
Rtp4      receptor (chemosensory) transporter protein 4
S100A11 (Includes Eg: 20195)  S100 calcium binding protein A11 (calgizzarin)
Sell      selectin L (lymphocyte adhesion molecule 1)
Slpi      secretory leukocyte peptidase inhibitor
Socs3     suppressor of cytokine signaling 3
Steap4    STEAP family member 4
Tacstd2   tumor-associated calcium signal transducer 2
Tirap     toll-interleukin 1 receptor (TIR) domain containing adaptor protein
Tlr1      toll-like receptor 1
Tlr4      toll-like receptor 4
Tnfsf14   tumor necrosis factor (ligand) superfamily, member 14

These genes cluster into the following GO categories: 0006952 (defense response; 25 genes; P<0.0001), 0009607 (response to biotic stimulus; 25 genes; P<0.0001), and 0006955 (immune response; 18 genes; P<0.0001). Samples from control (saline-injected) mice at 24 and 96 hours were indistinguishable using the CavC factor genes (FIG. 8A). In addition, the CavC factor was the sole predictor of infection status in a logistic regression and demonstrated a clear separation between infected and uninfected mice in the training cohort (FIG. 8B). The factor score indicated in the figure represents how strongly the CavC factor is expressed in a given sample. A subsequent set of 18 infected and 5 control mice served as a validation cohort for the original experimental set. All of the experimental parameters were identical to those in the original training cohort. As in the original cohort, fungal burdens increased and animal weights decreased over time [average weight loss of 0.13±0.66 g at day 1 (24 hours) after injection, 1.16±0.6 g at day 2 (48 hours) after injection, and 3.31±2.71 g at day 3 (72 hours) after injection]. Signature validation was accomplished using sparse latent factor regression analysis (FIG. 8B).

Of the 23 mice tested (5 controls and 18 infected), only one misclassification occurred in the validation group, with one infected mouse misclassified as a control (P=0.00018, Fisher's exact test). The difference in the average factor score (weight of the expression of the probes in the CavC factor) between infected and control mice in the validation cohort was significant (P=0.0000026, analysis of variance). Thus, these data define a distinct set of genes that robustly distinguish candidemic mice from healthy controls, and it has been validated that this set of genes can classify mice with Candida infection versus healthy mice in an independent cohort.
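The Fisher's exact test value for the validation cohort can be checked with the short Python sketch below. The 2×2 table is an assumption built from the counts stated above (18 infected mice, of which 17 were classified as infected and 1 as control; 5 controls, all classified as control); the function is a plain two-sided implementation written for illustration, not the software used in the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Rows: infected, control. Columns: classified infected, classified control.
p = fisher_exact_two_sided(17, 1, 0, 5)
print(f"P = {p:.5f}")
```

With this table the sketch gives P of approximately 0.00018, in agreement with the value reported for the validation group.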

Discriminative genes in the expression signatures represent key biology related to candidiasis. Strong evidence implicates the genes constituting the CavC factor in host defense against C. albicans. A generalized stress response or “acute-phase” response does not dominate the gene expression pattern that best distinguishes candidemic mice from healthy controls. In contrast, known pathways of host defense against fungal infection are prominently represented in the gene set that distinguishes candidemic mice from healthy controls. Key pathways prominently represented among the genes in this factor are the Tlr4/Myd88 pathway, the Socs3/Il-8 pathway, and the Irak4/Il-1 pathways. The intersection of the key pathways identified in the CavC factor is the interleukin-1β (IL-1β) response, which is stimulated in human monocytes by exposure to C. albicans via interaction with the monocytic mannose receptor. Additionally represented is Nlrp3, a member of the Nlrp3 inflammasome multiprotein complex. This innate immune pathway plays a key role in sensing Candida infection.

Example 9 Blood RNA Gene Expression Signatures are Specific for Candidemia

To determine the specificity of the candidemia signature, a group of mice with S. aureus bacteremia was compared to candidemic mice. The dose of S. aureus used to induce bacteremia resulted in a morbidity curve (FIG. 1) similar to that of the dose of C. albicans used in the candidemia cohorts. The goal of this experiment was to evaluate whether mice infected with C. albicans could be differentiated from mice infected with S. aureus by blood gene expression patterns. Data from the two challenges (C. albicans and S. aureus) were combined and analyzed as a single data set. Data from 72 mice were included in the analysis (C. albicans, n=46; control, n=17; S. aureus, n=9). Forty factors were developed using all available probes, from which two factors (factors 3 and 21) emerged as best able to discriminate among the three groups of mice (candidemia, bacteremia, or control). Factors 3 and 21 were distinct from the CavC factor. This analysis used model averaging to predict disease state (candidemia versus bacteremia versus control), with extensive cross-validation to support these predictions. Looking across all of the models, the probability (weighted by the likelihood) of finding any one of our factors in a predictive model was assessed. Factors 3 and 21 were responsible for almost all of the predictive accuracy of the model because they were in 98% and 99% of all models, respectively. The model was tested using 100 repetitions of 10-fold cross-validation, resulting in the small confidence intervals surrounding each prediction, as represented by the box plots in FIG. 9. FIG. 9 shows that murine blood gene expression signatures distinguish candidemia from S. aureus bacteremia and from uninfected controls. The algorithm for generating predictive models of infection with C. albicans was tested using 100 repetitions of 10-fold cross-validation. Each mouse is represented by a red cross. This analysis results in a range of possible predictions for each sample (shown as a blue box and whiskers) depending on which of the remaining samples were used to build the predictive model. Probability of a sample representing a mouse with candidemia is shown on the y axis.
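The way repeated cross-validation produces a range of predictions for each sample can be sketched in Python as follows. This is an illustration under stated assumptions and not the model-averaging procedure used in the study: the classifier is a toy logistic curve centered midway between the class means of a single factor score, and the factor scores themselves are simulated (28 infected, 12 control).

```python
import math
import random
from statistics import mean

def repeated_kfold_predictions(scores, labels, k=10, repeats=100, seed=0):
    """Collect the held-out prediction for every sample in each repetition."""
    rng = random.Random(seed)
    n = len(scores)
    held_out = [[] for _ in range(n)]
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)                      # new random partition each repetition
        for f in range(k):
            fold = order[f::k]                  # samples held out in this fold
            in_fold = set(fold)
            train = [i for i in order if i not in in_fold]
            pos = mean(scores[i] for i in train if labels[i] == 1)
            neg = mean(scores[i] for i in train if labels[i] == 0)
            # Toy classifier (assumption): logistic curve between the class means.
            midpoint = (pos + neg) / 2
            scale = (pos - neg) / 4 or 1.0
            for i in fold:
                z = (scores[i] - midpoint) / scale
                held_out[i].append(1 / (1 + math.exp(-z)))
    return held_out

# Simulated factor scores, for illustration only.
data_rng = random.Random(1)
scores = ([data_rng.gauss(1.0, 0.5) for _ in range(28)]
          + [data_rng.gauss(-1.0, 0.5) for _ in range(12)])
labels = [1] * 28 + [0] * 12
held_out = repeated_kfold_predictions(scores, labels)
ranges = [(min(p), max(p)) for p in held_out]   # per-sample spread of predictions
print(len(held_out), len(held_out[0]))
```

Each sample is held out exactly once per repetition, so it receives 100 predictions; the spread of those values corresponds to the box and whiskers described for FIG. 9.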

Regardless of which subset of data was used in the cross-validation to generate the model, we found that there was only one false-negative sample from the mice infected with Candida and one false-positive sample from among the mice infected with S. aureus (sensitivity, 98%; specificity, 96%). These data illustrate a potential stepwise discriminative pathway for determining infectious etiology and, ultimately, the ability to distinguish between two of the most common causes of bloodstream infection in hospitalized hosts: S. aureus and C. albicans.

As expected, the up-regulated genes contained in the combined model that distinguishes candidemia from S. aureus bacteremia (factors 3 and 21) (Tables 2B and 2C, respectively) clustered into the GO categories of 0009607 (response to biotic stimulus) (factor 3: 29 genes; P<0.0001) and 0006952 (defense response) (factor 21: 37 genes; P<0.0001). Genes constituting factor 3 include Bcl10, CD83, and Tlr1. Notably, Bcl10, through interactions with Card9, controls fungal-mediated myeloid cell activation via the dectin-1 pathway. Genes constituting factor 21 include genes in the Il-1 family, as well as NfkB, Tnfrsf1a, Tlr4, and Tlr6. Pathway analysis using GeneGo corroborated these findings by organizing the genes into the Tlr3 and Tlr4 pathways (Md2, Tirap, Tlr4, CD14, and Myd88), the CD40 signaling pathway [Pdpk1, Icam1, CD23, p38MAPK, FasR (CD95)], and Mif in the innate immune response pathway (Lbp, Cd14, Tlr4, Myd88, Irak4, P38mapk, C/ebpβ, Pu.1, and IL-1β). Additional components of these factors include the IL-2 pathway and related signaling, including Elf1, Il-2, Il-2r γ chain, Socs3, Ppp3cb (calcineurin A), Egr1, and Pdpk1.

TABLE 2B Candida vs. Staph, Factor 3, where the gene name is followed by the protein it encodes
Gene      Protein
Bcap29    B-Cell Receptor-Associated Protein 29
Bcl10     B-Cell Leukemia/Lymphoma 10
Btg1      B-cell translocation gene 1, anti-proliferative
Cbfb      Core binding factor beta
Ccl6      Chemokine (C-C motif) ligand 6
Cd164     CD164 antigen
Cd1d1     CD1d1 antigen
Cd38      CD38 antigen
Cd83      Cd83 Antigen
Crlf3     Cytokine receptor-like factor 3
Csk       C-src tyrosine kinase
Dgat1     Diacylglycerol O-acyltransferase 1
Dnajb4    DnaJ (Hsp40) Homolog, Subfamily B, Member 4
Dnajc10   DnaJ (Hsp40) homolog, subfamily C, member 10
Dnajc13   DnaJ (Hsp40) homolog, subfamily C, member 13
Dnajc9    DnaJ (Hsp40) homolog, subfamily C, member 9
Dock9     Dedicator Of Cytokinesis 9
Faf1      Fas-associated factor 1
Fcer2a    Fc receptor, IgE, low affinity II, alpha polypeptide
Hsp90aa1  Heat shock protein 90 kDa alpha (cytosolic), class A member 1
Hspa4     Heat shock protein 4
Hspd1     Heat shock protein 1 (chaperonin)
Igbp1     Immunoglobulin (Cd79A) Binding Protein 1
Igk-V8    Immunoglobulin Kappa Chain Variable 8 (V8)
Ik        IK cytokine
Il27Ra    Interleukin 27 Receptor, Alpha
Il2rg     Interleukin 2 receptor, gamma chain
Ilf3      Interleukin enhancer binding factor 3
Impdh2    Inosine 5′-phosphate dehydrogenase 2
Ly6a      Lymphocyte antigen 6 complex, locus A
Ly75      Lymphocyte antigen 75
Ly86      Lymphocyte Antigen 86
Mtcp1     Mature T-Cell Proliferation 1
Nfatc3    Nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent
Sacs      Sacsin
Scye1     Small inducible cytokine subfamily E, member 1
Sdcbp     Syndecan Binding Protein
Sell      Selectin, lymphocyte
Socs5     Suppressor of cytokine signaling 5
Ssb       Sjogren syndrome antigen B
Tlr1      Toll-like receptor 1
Trim21    Tripartite motif protein 21

TABLE 2C Candida vs. Staph, Factor 21, where the gene name is followed by the protein it encodes
Gene      Protein
Adam17    A disintegrin and metallopeptidase domain 17
Adam19    A disintegrin and metallopeptidase domain 19 (meltrin beta)
Adam8     A Disintegrin And Metallopeptidase Domain 8
Anxa2     Annexin A2
Bcl3      B-Cell Leukemia/Lymphoma 3
Bcl6      B-cell leukemia/lymphoma 6
Ccl6      Chemokine (C-C motif) ligand 6
Ccr1      Chemokine (C-C motif) receptor 1
Cd14      Cd14 Antigen
Cd177     CD177 antigen
Cd300d    Cd300D antigen
Cd33      CD33 antigen
Cd44      CD44 antigen
Cd68      CD68 antigen
Cklf      Chemokine-like factor
Fas       Fas (TNF receptor superfamily member 6)
Fcgr3     Fc receptor, IgG, low affinity III
Hcrtr2    Cd200 Receptor 1
Icam1     Intercellular adhesion molecule
Ifitm1    Interferon Induced Transmembrane Protein 1
Ifitm2    Interferon Induced Transmembrane Protein 2
Ifitm6    Interferon induced transmembrane protein 6
Igsf6     Immunoglobulin superfamily, member 6
Ikbkap    Inhibitor of kappa light polypeptide enhancer in B-cells, kinase complex-associated protein
Il13ra1   Interleukin 13 receptor, alpha 1
Il17Ra    Interleukin 17 Receptor
Il18rap   Interleukin 18 receptor accessory protein
Il1B      Interleukin 1 Beta
Il1f9     Interleukin 1 family, member 9
Il1r2     Interleukin 1 receptor, type II
Il1rap    Interleukin 1 receptor accessory protein
Il1Rn     Interleukin 1 Receptor Antagonist
Il8Rb     Interleukin 8 Receptor, Beta
Irak4     Interleukin-1 receptor-associated kinase 4
Irg1      Immunoresponsive gene 1
Klra2     Killer cell lectin-like receptor, subfamily A, member 2
Lcp1      Lymphocyte Cytosolic Protein 1
Lcp2      Lymphocyte cytosolic protein 2
Leng4     Leukocyte Receptor Cluster (Lrc) Member 4
Lgals3    Lectin, galactose binding, soluble 3
Lilrb3    Leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 3
Ly96      Lymphocyte Antigen 96
Lyst      Lysosomal trafficking regulator
Mapk14    Mitogen activated protein kinase 14
Myd88     Myeloid Differentiation Primary Response Gene 88
Nfil3     Nuclear factor, interleukin 3, regulated
Ngp       Neutrophilic granule protein
Nlrp3     NLR family, pyrin domain containing 3
Pxn       Paxillin
Sell      Selectin, lymphocyte
Sept5     Septin 5
Slpi      Secretory Leukocyte Peptidase Inhibitor
Socs3     Suppressor of cytokine signaling 3
Stat3     Signal transducer and activator of transcription 3
Tirap     Toll-interleukin 1 receptor (TIR) domain-containing adaptor protein
Tlr4      Toll-like receptor 4
Tlr6      Toll-like receptor 6
Tnfrsf1a  Tumor necrosis factor receptor superfamily, member 1a
Tnfsf14   Tumor Necrosis Factor (Ligand) Superfamily, Member 14

Example 10 Blood RNA Gene Expression Signatures can Differentiate Between Early and Late IC

Detection of early versus late disease is a critical characteristic for a gene expression signature to become an effective diagnostic test in the clinical setting and possibly allow a prognostic evaluation of disease outcome. Thus, the varying time points after infection (days 1 to 4) were used as surrogates for infection severity, ranging from early disease (day 1) to late disease (days 3 to 4), on the basis of both the dose-ranging study morbidity curves generated in our laboratory (FIG. 1) and published data documenting that mice injected with C. albicans via the tail vein die of progressive sepsis (B. Spellberg, A. S. Ibrahim, J. E. Edwards Jr., S. G. Filler, Mice with disseminated candidiasis die of progressive sepsis. J. Infect. Dis. 192, 336-343 (2005)). As described above, factors were generated without regard to infection type or duration.

Murine blood gene expression signatures classify day after injection, a potential surrogate for infection severity. Severity of infection (as represented by days after tail vein injection) can be predicted by gene expression signatures using an unsupervised analysis of the candidemic mice and controls alone. Three factors are strongly associated with early infection (24 hours; P=2.7×10⁻⁷), mid-infection (48 hours; P=2.1×10⁻⁸), and late infection (72 to 96 hours; P=1.7×10⁻⁸). The factors were used as predictors of their respective phenotypes, and for each factor a cutoff was used that maximizes the sum of sensitivity and specificity. FIG. 10 shows that gene expression signatures can classify day after injection, a surrogate for infection severity. Novel factors are able to classify early (day 1, i.e., 24-hour), mid (day 2, i.e., 48-hour), and late (days 3 to 4) time points following injection with a high degree of accuracy. As shown in FIG. 10A, the 24-hour factor correctly identified 13 of 14 of the 24-hour samples and 46 of 49 of the remaining samples (control and days 2 to 4) (sensitivity, 93%; specificity, 94%). As shown in FIG. 10B, the midcourse infection (48 hours) factor identified 14 of 14 of the 48-hour samples and misclassified 2 of 49 of the remaining samples (sensitivity, 100%; specificity, 96%). As shown in FIG. 10C, the late infection factor identified 15 of 18 (sensitivity, 83%) of the day 3 to 4 mice and 44 of 45 (specificity, 98%) of the remaining samples. Using a nonlinear model in a supervised fashion, the predictive capacity of the “days after injection” factors (FIG. 10D) was determined. Note that the scatter along the y axis is due to differences in estimated time of infection, whereas the scatter along the x axis is artificial for the purposes of improved visualization.
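Choosing a cutoff that maximizes the sum of sensitivity and specificity can be sketched in Python as follows. This is a minimal illustration, assuming that a higher factor score indicates the phenotype of interest; the eight scores and labels are hypothetical and are not data from the study.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    A sample is called positive when its score is >= cutoff. Candidate cutoffs
    are the observed scores; ties keep the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best

# Hypothetical factor scores (1 = phenotype of interest, 0 = all other samples).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, sens, spec = best_cutoff(scores, labels)
print(cutoff, sens, spec)
```

For these hypothetical values the selected cutoff is 0.6, which gives up one positive sample (sensitivity 0.8) in exchange for classifying every negative sample correctly (specificity 1.0).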

Among the 40 factors developed for this model, we discovered factors that strongly associate with early infection (day 1 factor: 9 up-regulated genes and 36 down-regulated genes; P=2.7×10⁻⁷, Student's t test), midpoint infection (day 2 factor: 122 up-regulated genes and 4 down-regulated genes; P=2.1×10⁻⁸, Student's t test), and progressive infection (day 3 factor: 79 up-regulated genes and 4 down-regulated genes; day 4 factor: 31 up-regulated genes and 32 down-regulated genes; P=1.7×10⁻⁸, Student's t test) (FIG. 10A-D). These factors were distinct from the CavC factor as well as from factors 3 and 21 described above. Notably, in the initial cohort of mice infected with C. albicans, a robust distinction between day 3 and day 4 after injection was not possible using any factor. Therefore, mice used for validation studies were only killed at days 1, 2, or 3 after infection.

On day 1, there is increased expression of CXCL2 and CXCL13, two chemokines produced by macrophages to attract neutrophils and B cells, respectively, to the site of inflammation. CXCL2, also known as MIP2, is activated by macrophage exposure to fungi, including Candida and Aspergillus (G. Overland, J. F. Stuestol, M. K. Dahle, A. E. Myhre, M. G. Netea, P. Verweij, A. Yndestad, P. Aukrust, B. J. Kullberg, A. Warris, J. E. Wang, A. O. Aasen, Cytokine responses to fungal pathogens in Kupffer cells are Toll-like receptor 4 independent and mediated by tyrosine kinases. Scand. J. Immunol. 62, 148-154 (2005)). There was also up-regulation of the receptor for IL-10, which directs the immune response toward the T helper 2 pathway in IC. On day 2, key genes expressed were in the tumor necrosis factor (TNF)-nuclear factor κB (NFκB)-Bcl2 pathway (antiapoptotic), including Tnfα, TNFr1, CD40, Tac1, Tweak, April, Ikkβ, IkB, NFκB, Bfl1, Relb, NfκB2 (p52), and NFκB2 (p100). Analysis of the day 3 and 4 factors revealed down-regulation of apoptotic genes upstream of caspase-3, including Apaf1, Bad, and Ppp3cb. In addition, there was evidence of involvement of Toll-like receptor pathways with up-regulation of IRF3, which is specific for TLR3/4-mediated immune response. TLR4 has a well-established role in recognizing the Candida cell wall by binding to O-linked mannosyl moieties (M. G. Netea, N. A. Gow, C. A. Munro, S. Bates, C. Colins, G. Ferwerda, R. P. Hobson, G. Bertram, H. B. Hughes, T. Jansen, L. Jacobs, E. T. Buurman, K. Gijzen, D. L. Williams, R. Torensma, A. McKinnon, D. M. MacCallum, F. C. Odds, J. W. Van der Meer, A. J. Brown, B. J. Kullberg, Immune sensing of Candida albicans requires cooperative recognition of mannans and glucans by lectin and Toll-like receptors. J. Clin. Invest. 116, 1642-1650 (2006)).
Up-regulation of inflammatory cytokines, such as CXCL4, IL-10, and IL-2Rγ, and genes involved in NFκB-dependent and mitogen-activated protein kinase-dependent signaling cascades occurs at this point. Notably, NFAT, the nuclear factor of activated T cells, is also up-regulated in the late-stage disease factor. NFAT is a broad immunoresponsive signal recently implicated in host response to Candida infection. Dectin-1 ligation by C. albicans triggers NFAT activation in macrophages and dendritic cells. Subsequently, the growth response of these cells is activated, as well as IL-2, IL-20, and IL-12p70 (H. S. Goodridge, R. M. Simmons, D. M. Underhill, Dectin-1 stimulation by Candida albicans yeast or zymosan triggers NFAT activation in macrophages and dendritic cells. J. Immunol. 178, 3107-3115 (2007)).

The day 1 factor was able to correctly classify 13 of 14 of the day 1 samples and 46 of 49 of the remaining samples (control and days 2 to 4) (sensitivity, 93%; specificity, 94%) using the day-specific factors by themselves as predictors of their respective phenotypes, and using a cutoff that maximizes the sum of sensitivity and specificity (FIG. 10A and Table 2D). Table 3 shows the sensitivity and specificity of the predictive model for candidemia. Furthermore, statistical data on the strength of the individual factors associated with the duration of infection are provided. For midcourse infection (day 2), the discriminative factor identified 14 of 14 of the day 2 samples and misclassified 2 of 49 of the remaining samples (sensitivity, 100%; specificity, 96%) (FIG. 10B and Table 2E). Finally, the late infection factor identified 15 of 18 (sensitivity, 83%) of the day 3 to 4 mice and 44 of 45 (specificity, 98%) of the remaining mice (FIG. 10C and Table 2F).
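As a quick arithmetic check, the percentages quoted above follow directly from the reported counts. The sketch below simply recomputes them; the dictionary layout and variable names are illustrative, and the 48-hour specificity count of 47 of 49 is derived from the statement that 2 of 49 remaining samples were misclassified.

```python
# (correctly classified, total) for each day-specific factor, as reported in the text.
counts = {
    "24 hours": {"sensitivity": (13, 14), "specificity": (46, 49)},
    "48 hours": {"sensitivity": (14, 14), "specificity": (49 - 2, 49)},
    "72-96 hours": {"sensitivity": (15, 18), "specificity": (44, 45)},
}

# Convert each count pair to a percentage rounded to the nearest whole number.
percentages = {
    factor: {name: round(100 * hit / total) for name, (hit, total) in rates.items()}
    for factor, rates in counts.items()
}

for factor, rates in percentages.items():
    print(f"{factor}: sensitivity {rates['sensitivity']}%, "
          f"specificity {rates['specificity']}%")
```

The rounded values agree with the day-specific rows of Table 3.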

TABLE 2D Day 1 Genes, where the gene name is followed by the protein it encodes

Gene      Protein
Cd177     CD177 antigen
Cxcl13    Chemokine (C-X-C motif) ligand 13
Cxcl2     Chemokine (C-X-C Motif) Ligand 2
Glipr2    GLI pathogenesis-related 2
Gpr97     G Protein-Coupled Receptor 97
Hspb1     Heat Shock Protein 1
Ifitm6    Interferon induced transmembrane protein 6
Il10rb    Interleukin 10 receptor, beta
Il1r2     Interleukin 1 receptor, type II

TABLE 2E Day 2 Genes, where the gene name is followed by the protein it encodes

Gene      Protein
Adam17    A Disintegrin And Metallopeptidase Domain 17
Anxa2     Annexin A2
Apaf1     Apoptotic Peptidase Activating Factor 1
Bcl2A1A   B-Cell Leukemia/Lymphoma 2 Related Protein A1A
Bcl2A1B   B-Cell Leukemia/Lymphoma 2 Related Protein A1B
Bcl2A1D   B-Cell Leukemia/Lymphoma 2 Related Protein A1D
Bcl6      B-Cell Leukemia/Lymphoma 6
Ccr2      Chemokine (C-C Motif) Receptor 2
Cd180     Cd180 Antigen
Cd274     Cd274 Antigen
Cd300A    Cd300A Antigen
Cd300D    Cd300D Antigen
Cd40      Cd40 Antigen
Cd44      Cd44 Antigen
Cd48      Cd48 Antigen
Cd52      Cd52 Antigen
Cd53      Cd53 Antigen
Cd68      Cd68 Antigen
Cd97      Cd97 Antigen
Cdkn1A    Cyclin-Dependent Kinase Inhibitor 1A (P21)
Cebpb     Ccaat/Enhancer Binding Protein (C/Ebp), Beta
Csf2Ra    Colony Stimulating Factor 2 Receptor, Alpha, Low-Affinity (Granulocyte)
Ctsb      Cathepsin B
Dock2     Dedicator Of Cytokinesis 2
Dock8     Dedicator Of Cytokinesis 8
Gab1      Growth Factor Receptor Bound Protein 2-Associated Protein 1
H2-M3     Histocompatibility 2, M Region Locus 3
Hprt1     Hypoxanthine Guanine Phosphoribosyl Transferase 1
Hspa8     Heat Shock Protein 8
Icam1     Intercellular Adhesion Molecule
Ifi202B   Interferon Activated Gene 202
Ifi30     Interferon Gamma Inducible Protein 30
Ifitm6    Interferon Induced Transmembrane Protein 6
Ifnar1    Interferon (Alpha And Beta) Receptor 1
Ifngr1    Interferon Gamma Receptor 1
Ifngr2    Interferon Gamma Receptor 2
Igsf6     Immunoglobulin Superfamily, Member 6
Ikbkb     Inhibitor Of Kappab Kinase Beta
Il10Ra    Interleukin 10 Receptor, Alpha
Il17Ra    Interleukin 17 Receptor
Il18      Interleukin 18
Il2Rg     Interleukin 2 Receptor, Gamma Chain
Il2Rg     Interleukin 2 Receptor, Gamma Chain
Inpp5D    Inositol Polyphosphate-5-Phosphatase D
Irf5      Interferon Regulatory Factor 5
Irf8      Interferon Regulatory Factor 8
Itgal     Integrin Alpha L
Itgav     Integrin Alpha V
Itgb2     Integrin Beta 2
Lair1     Leukocyte-Associated Ig-Like Receptor 1
Lat2      Linker For Activation Of T Cells Family, Member 2
Lcp1      Lymphocyte Cytosolic Protein 1
Lgals3    Lectin, Galactose Binding, Soluble 3
Lilrb3    Leukocyte Immunoglobulin-Like Receptor, Subfamily B (With Tm And Itim)
Lst1      Leukocyte Specific Transcript 1
Ly6C1     Lymphocyte Antigen 6 Complex, Locus C
Ly6I      Lymphocyte Antigen 6 Complex, Locus I
Ly86      Lymphocyte Antigen 86
Lyz       P Lysozyme Structural
Lyzs      Lysozyme
Map4K1    Mitogen Activated Protein Kinase Kinase Kinase Kinase 1
Map4K4    Nck Interacting Kinase
Map4K4    Nck Interacting Kinase
Myd88     Myeloid Differentiation Primary Response Gene 88
Nfkb2     Nuclear Factor Of Kappa Light Polypeptide Gene Enhancer In B-Cells
Nlrp3     Cold Autoinflammatory Syndrome 1 Homolog (Human)
Pilrb1    Paired Immunoglobin-Like Type 2 Receptor Beta
Prdx1     Peroxiredoxin 1
Scye1     Small Inducible Cytokine Subfamily E, Member 1
Spn       Sialophorin
Strn3     Striatin, Calmodulin Binding Protein 3
Tcea1     Transcription Elongation Factor A (Sii) 1
Tcirg1    T-Cell, Immune Regulator 1, Atpase, H+ Transporting, Lysosomal V0 Prot . . .
Tgfb1     Transforming Growth Factor, Beta 1
Tiam1     T-Cell Lymphoma Invasion And Metastasis 1
Tnf       Tumor Necrosis Factor

TABLE 2F Day 3-4 Genes, where the gene name is followed by the protein it encodes

Gene      Protein
Atf6      Activating transcription factor 6
Bcap31    B-cell receptor-associated protein 31
Bcl10     B-Cell Leukemia/Lymphoma 10
Casp3     Caspase 3
Cd151     CD151 antigen
Cd151     CD151 antigen
Cd164     CD164 antigen
Cd24a     CD24a antigen
Cd274     CD274 antigen
Cd40      Cd40 Antigen
Cd44      CD44 antigen
Cd47      CD47 antigen (Rh-related antigen, integrin-associated signal transducer)
Cd7       CD7 antigen
Cd81      Cd81 Antigen
Cd84      Cd84 Antigen
Cd9       CD9 antigen
Cd93      Complement Component 1, Q Subcomponent, Receptor 1
Cdh23     Cadherin 23 (otocadherin)
Chrne     Cholinergic receptor, nicotinic, epsilon polypeptide
Crtam     Cytotoxic and regulatory T cell molecule
Ctse      Cathepsin E
Cxcl4     Chemokine (C-X-C Motif) Ligand 4
Dgat1     Diacylglycerol O-acyltransferase 1
Dnajb4    Dnaj (Hsp40) Homolog, Subfamily B, Member 4
Fkbp4     FK506 binding protein 4
Glipr2    GLI pathogenesis-related 2
Glud1     Glutamate dehydrogenase 1
Gsk3b     Glycogen synthase kinase 3 beta
Hdac5     Histone Deacetylase 5
Hsbp1     Heat shock factor binding protein 1
Hscb      Expressed Sequence Aw049829
Hsp110    Heat Shock Protein 110
Hsp110    Heat Shock Protein 110
Hspa9     Heat shock protein 9
Ifrd1     Interferon-related developmental regulator 1
Igtp      Interferon gamma induced GTPase
Il1B      Interleukin 1 Beta
Il2rg     Interleukin 2 receptor, gamma chain
Ilf3      Interleukin enhancer binding factor 3
Impdh2    Inosine 5′-phosphate dehydrogenase 2
Inpp5d    Inositol polyphosphate-5-phosphatase D
Irf3      Interferon regulatory factor 3
Isg20     Interferon-Stimulated Protein
Itgal     Integrin alpha L
Itgav     Integrin Alpha V
Itgb2     Integrin beta 2
Itpkb     Inositol 1,4,5-trisphosphate 3-kinase B
Jak2      Janus kinase 2
Lat       Linker for activation of T cells
Lcp1      Lymphocyte Cytosolic Protein 1
Ly6G6C    Lymphocyte Antigen 6 Complex, Locus G6C
Map2k3    Mitogen activated protein kinase kinase 3
Map2k3    Mitogen activated protein kinase kinase 3
Mapk14    Mitogen activated protein kinase 14
Muc13     Mucin 13, epithelial transmembrane
NA        N-Myc Downstream Regulated Gene 1
Nfkb1     Nuclear Factor Of Kappa Light Chain Gene Enhancer In B-Cells 1, P105
Pcna      Proliferating Cell Nuclear Antigen
Phb2      Prohibitin 2
Prdx2     Peroxiredoxin 2
Sell      Selectin, lymphocyte
Slamf1    Signaling lymphocytic activation molecule family member 1
Slc44A1   Solute Carrier Family 44, Member 1
Slpi      Secretory Leukocyte Peptidase Inhibitor
Tgfb1     Transforming Growth Factor, Beta 1
Tollip    Toll interacting protein

TABLE 3 Sensitivity and specificity of blood gene expression signatures of candidemia.

Factor        Sensitivity (%)   Specificity (%)
CavC          98                96
24 hours      93                94
48 hours      100               96
72-96 hours   83                98

The genes identified in discriminant analysis for Candida infection are well known to be part of host response to fungal infection. Pdpk1 (3-phosphoinositide-dependent protein kinase-1) is involved in cellular phagocytosis of fungal conidia, with inhibition leading to failed uptake of Aspergillus fumigatus conidia in vitro (K. Luther, M. Rohde, K. Sturm, A. Kotz, J. Heesemann, F. Ebel, Characterisation of the phagocytic uptake of Aspergillus fumigatus conidia by macrophages. Microbes Infect. 10, 175-184 (2008)). The representation of Ppp3cb links the signature to the multiple lines of evidence implicating the calcineurin pathway in host antifungal defense (W. J. Steinbach, R. A. Cramer Jr., B. Z. Perfect, C. Henn, K. Nielsen, J. Heitman, J. R. Perfect, Calcineurin inhibition or mutation enhances cell wall inhibitors against Aspergillus fumigatus. Antimicrob. Agents Chemother. 51, 2979-2981 (2007)). Although no gene or gene product is 100% specific to antifungal host defense, the power of this analysis lies in deriving predictive combinations of genes that classify specific types of infection (that is, candidemia versus bacteremia). These data further corroborate this host-based approach to distinguish different disease states in response to pathogens.

The response of Candida to interaction with host immune cells has been explored extensively (C. Fradin, P. De Groot, D. MacCallum, M. Schaller, F. Klis, F. C. Odds, B. Hube, Granulocytes govern the transcriptional response, morphology and proliferation of Candida albicans in human blood. Mol. Microbiol. 56, 397-415 (2005); C. Fradin, M. Kretschmar, T. Nichterlein, C. Gailardin, C. d'Enfert, B. Hube, Stage-specific gene expression of Candida albicans in human blood. Mol. Microbiol. 47, 1523-1543 (2003); M. C. Lorenz, J. A. Bender, G. R. Fink, Transcriptional response of Candida albicans upon internalization by macrophages. Eukaryot. Cell 3, 1076-1087 (2004); I. Rubin-Bejerano, I. Fraser, P. Grisafi, G. R. Fink, Phagocytosis by neutrophils induces an amino acid deprivation response in Saccharomyces cerevisiae and Candida albicans. Proc. Natl. Acad. Sci. U.S.A. 100, 11007-11012 (2003)), whereas few studies have evaluated the host response to C. albicans.

The early host response to C. albicans was evaluated using ex vivo stimulation of human neutrophils and mononuclear cells with C. albicans yeast and hyphal forms. This study stimulated neutrophils for 1 hour and found that 191 differentially expressed genes characterized the neutrophil response to C. albicans. Although it is difficult to compare directly a 1-hour cell type-specific incubation with a murine systemic exposure, clear similarities in the responses are evident (that is, representation of leukemia inhibitory factor, CXCL2, IRF5, and stress response pathways). Notably, the immunological profile in the murine kidney and spleen in response to C. albicans infection has also been characterized using multicomponent enzyme-linked immunosorbent assay (ELISA). Although the differences in target organ tested (blood versus kidney) and methodology (microarray versus ELISA) make direct comparison between our study and this work difficult, it is notable that MacCallum (D. M. MacCallum, Massive induction of innate immune response to Candida albicans in the kidney in a murine intravenous challenge model. FEMS Yeast Res. 9, 1111-1122 (2009)) found compartmentalization of host response to C. albicans infection, with differential expression of cytokines and chemokines tested between the spleen and kidney. This work, in conjunction with ours, highlights the use of emerging technologies to dissect the host response to candidemia.

Example 11 Blood RNA Gene Expression Signatures can Predict Progressive Infection

After distinct factors associated with samples at different points in the course of infection had been developed, a model was built that combines all previously identified day-specific factors to determine the elapsed time after infection. Some factors exhibit interesting and important variation that is neither increasing nor decreasing over the entire course of the disease (FIG. 10B). This structure (differential expression at only one time point) precludes the use of a standard linear model to predict infection duration. A linear model in the context of categorical outcomes (control, days 1 to 4) may also be used. Here, cubic splines were used to build a predictive model. For each sample, the set of factor scores for the k most important factors was treated as a point in k-dimensional Euclidean space. A cubic spline with knots at the mean for each time point was then fit. Each point on this curve is identified with an infection duration, and for each sample, the predicted duration is the one associated with the point on the curve closest to that sample. The model was validated with “leave-one-out” cross-validation, at each step leaving out a sample, rechoosing the important factors based on the remaining samples, and rebuilding the model based on those factors. The result of this analysis is presented in FIG. 10D. It shows that prediction of the duration of infection is quite robust, with only three observations off by more than a day. The predictive capacity of the “days after injection” factors was also determined using a nonlinear model in a supervised fashion and the entire probe set (see FIG. 14).
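The spline-based duration model described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used in the study: it assumes a natural cubic spline (the boundary conditions are not specified in the text), finds the closest point on the curve by a simple grid search, omits factor selection and the leave-one-out loop, and uses made-up two-factor scores. The names natural_cubic_spline, fit_curve, and predict_duration are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(t, y):
    """Return f(x), the natural cubic spline through the knots (t[i], y[i])."""
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    # Forward sweep of the tridiagonal system for the quadratic coefficients c[i].
    lo, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        alpha = 3 * (y[i + 1] - y[i]) / h[i] - 3 * (y[i] - y[i - 1]) / h[i - 1]
        lo[i] = 2 * (t[i + 1] - t[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / lo[i]
        z[i] = (alpha - h[i - 1] * z[i - 1]) / lo[i]
    # Back substitution; c[0] = c[n] = 0 gives the "natural" end conditions.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def f(x):
        j = min(max(bisect_right(t, x) - 1, 0), n - 1)  # segment containing x
        dx = x - t[j]
        return y[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return f


def fit_curve(samples, days):
    """Fit one spline per factor through the mean factor scores at each time point."""
    knots = sorted(set(days))
    k = len(samples[0])
    means = []
    for day in knots:
        group = [s for s, d in zip(samples, days) if d == day]
        means.append([sum(s[i] for s in group) / len(group) for i in range(k)])
    splines = [natural_cubic_spline(knots, [m[i] for m in means]) for i in range(k)]
    return knots, splines


def predict_duration(sample, knots, splines, steps=400):
    """Return the time whose point on the curve is closest to the sample."""
    t0, t1 = knots[0], knots[-1]
    grid = [t0 + (t1 - t0) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda t: sum((f(t) - x) ** 2 for f, x in zip(splines, sample)))


# Made-up scores for k = 2 factors (illustration only): the first factor peaks
# at day 1 and the second at day 2, so neither is monotone over time.
samples = [[0.1, -0.1], [-0.1, 0.1], [2.1, 0.3], [1.9, 0.1], [0.6, 2.1],
           [0.4, 1.9], [0.2, 0.9], [0.0, 0.7], [0.1, 0.6], [-0.1, 0.4]]
days = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]  # 0 = uninfected control
knots, splines = fit_curve(samples, days)
predicted = predict_duration([1.9, 0.25], knots, splines)
print(predicted)
```

Because each factor is allowed to rise and fall along the curve, a factor that is differentially expressed at only one time point still contributes to locating a sample in time, which is the property a single linear predictor lacks.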

Comparing the factors from days 1 to 4 not only reveals the genes and pathways involved in the immune response to candidemia but also provides information regarding the evolution of host response with increasing duration of disease. An evolving host response across time is suggested; the candidemic mouse model exhibits a dynamic immune response that can be captured via transcriptional profiling. Genes expressed in “Early disease,” such as the macrophage yeast interaction genes CXCL2 and CXCL13, and IL-10, have established roles in early phases of an immune response. Genes expressed in “Mid-stage disease,” such as the inflammation gene TNF/NFκB, the antiapoptotic gene BCL2, interferon-γ, and cathepsin B, representing worsening disease burden, revealed a more robust response involving elements of both the innate and the adaptive immune response. “Late disease,” representative of the premorbid state, was characterized by up-regulation of caspase-3, a protease involved in apoptosis, which is not up-regulated in mid-stage disease. Other “Late disease” gene examples include the innate immunity genes TLR4/TLR3 and the dendritic cell activation gene NFAT2. With further studies of survival and impact of antifungal treatment, the potential exists for identifying transcriptional programs and mechanisms affecting host prognosis with candidemia.

PROPHETIC EXAMPLES

Example 12 A Diagnostic Strategy for Assessing Candida Infection Using Gene Expression Signatures

Host response signatures have several potential clinical applications (FIG. 11), including early diagnostic testing and serial monitoring of extremely high-risk patients. In the first instance, being able to quickly and accurately distinguish candidemia from bacteremia may have important benefits, as pathogen-appropriate therapy can be initiated at the earliest possible time point. Predictive signatures will be used clinically under this proposed algorithm, in a “point of care” manner as illustrated by the left side of the chart in FIG. 11. An additional approach to using these signatures is serial monitoring of high-risk individuals during a defined period, with detection of a predictive signature triggering further testing or empiric therapy.

Such early initiation of therapy is imperative for improving patient outcomes. Moreover, by using the host response to infection as a means of determining pathogen class, this method can also provide a complement to traditional microbiologic techniques, such as culture or antigen detection. Compelling evidence exists in the literature that the optimal diagnostic paradigm for invasive fungal infections is likely a combination of tests (F. F. Alam, A. S. Mustafa, Z. U. Khan, Comparative evaluation of (1,3)-β-D-glucan, mannan and anti-mannan antibodies, and Candida species-specific snPCR in patients with candidemia. BMC Infect. Dis. 7, 103 (2007); F. Botterel, C. Farrugia, P. Ichai, J. M. Costa, F. Saliba, S. Bretagne, Real-time PCR on the first galactomannan-positive serum sample for diagnosing invasive aspergillosis in liver transplant recipients. Transpl. Infect. Dis. 10, 333-338 (2008)). Integration with yeast biomarkers such as β-glucan or fungal polymerase chain reaction may make these combined host-pathogen-based tests precise early diagnostic strategies to improve the 20 to 40% mortality rate of IC. In the second instance, such signatures will be used in a serial monitoring strategy in high-risk individuals. With scheduled sampling during high-risk periods, the development of a “candidemia” signature will trigger further diagnostic testing (for example, blood cultures and β-glucan) or empiric antifungal therapy.

There is precedent for validating murine-derived disease state-specific gene expression signatures in human cohorts (H. K. Dressman, G. G. Muramoto, N. J. Chao, S. Meadows, D. Marshall, G. S. Ginsburg, J. R. Nevins, J. P. Chute, Gene expression signatures that predict radiation exposure in mice and humans. PLoS Med. 4, e106 (2007)). Thus, subject to validation in a human cohort, these gene expression signatures will be used in a preemptive monitoring strategy or as an aid to direct preemptive antifungal therapy in appropriate hosts.

The rigorous cross-validation applied across experimentally infected cohorts described in this work illustrates the robust nature of the blood response to Candida bloodstream infection. Validation of the gene expression signatures in high-risk human cohorts, both at time of diagnosis of candidemia and in a preemptive monitoring setting, is clearly needed to elevate these findings to true diagnostic and prognostic tools. Additionally, such data would be extremely valuable if they could be used to either diagnose infection class before standard microbiologic studies (that is, in the early phases of disease) or indicate prognosis after disease acquisition or therapeutic intervention.

Thus, the invention provides, among other things, methods for identifying infectious disease and assays for identifying infectious disease. Various features and advantages of the invention are set forth in the following claims. 

1.-62. (canceled)
 63. A method of identifying and treating a subject having candidiasis comprising: a) determining gene expression levels of candidemia versus control (“CavC”) genes in a peripheral blood cell sample of the subject wherein the CavC genes consist of Adam19, Adam8, Anxa2, Bcl6, Bst1, Ccr1, Cd14, Cd177, Cd300Lf, Cd33, Cd52, Cklf, Crlf2, Csf2Rb, Csf3R, Ebi3, Fcer1G, Fcgr2A, Glipr1, Gpr97, Icoslg, Ifitm1, Ifitm2, Ifitm6, Ikbkap, Il10Rb, Il13Ra1, Il1B, Il1F9, Il1R2, Il1Rap, Il1Rn, Il8Rb, Irg1, Klra1, Lcp1, Lgals3, Lilra6, Lilrb4, Lsp1, Ly6C1, Lyst, Mboat7, Nfam1, Nfil3, Ngp, Nlrp3, Pbx1, Pxn, Rtp4, S100A11, Sell, Slpi, Socs3, Steap4, Tacstd2, Tirap, Tlr1, Tlr4, and Tnfsf14; b) comparing the gene expression levels of the CavC genes to standard gene expression levels wherein the standard gene expression levels correspond to the gene expression levels for the CavC genes in a subject that does not have candidiasis, wherein the gene expression levels of mRNA are measured; c) using the expression levels of each of the CavC genes in the peripheral blood cell sample as features for a logistic regression model to generate a factor score; d) identifying the subject as having candidiasis if the factor score is higher than a factor score for a control, wherein the candidiasis is candidemia; and e) administering an effective amount of an antifungal therapy to treat the subject identified as having candidiasis; wherein the subject is a human or a mouse.
 64. The method of claim 63, wherein step (a) comprises a method selected from the group consisting of assaying the sample with an array comprising a plurality of nucleic acid oligomers, Northern blot hybridization analysis, RT-PCR, and quantitative RT-PCR. 